Opened 9 years ago
Closed 6 years ago
#571 closed defect (fixed)
snark char mapping on torrent creation
Reported by: | zzz | Owned by: | zzz |
---|---|---|---|
Priority: | minor | Milestone: | 0.9.15 |
Component: | apps/i2psnark | Version: | 0.8.11 |
Keywords: | | Cc: | |
Parent Tickets: | | Sensitive: | no |
Description
snark maps 'unsafe' filename chars and chars not supported in the local charset to 'safe' chars. This works fine when downloading a torrent file, but when creating a torrent it can't re-find the file it just found if it had to map any chars. This leads to incomplete torrents for the seeder.
We should try to rename the file, or warn the user, or remember the mapping, or some combination of that.
Change History (10)
comment:1 Changed 9 years ago by
comment:2 Changed 9 years ago by
Basically, you have to check for the file as-is upon setup/init.
Just map the name first and don't do anything else; then create the file if the possibly translated one does not exist, or fall through as usual and check.
As for safety checks:
on system 'a' that has the original LF, it maps it 1:1
on system 'b' the file with the LF in it actually opens
This happens because the file did not exist!
When it checks the second time and the 'translated' one does exist, all is fine from then on, because again the original name does not exist and gets mapped to its safe Unicode representation.
That handles all the possible corner cases even if the user switches their locale.
Note also that you should trim the file name to something sane too, so that you do not exceed the filename length limits.
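The lookup order suggested here can be sketched roughly as below. This is only an illustration, not snark's code: `NameLookup`, `resolve`, and `mapName` are hypothetical names, and the set of characters treated as unsafe is an assumption.

```java
import java.io.File;

final class NameLookup {
    // Hypothetical stand-in for snark's name filter: replaces a few
    // characters assumed to be unsafe with '_'.
    static String mapName(String name) {
        return name.replaceAll("[<>:\"/\\\\|?*\\n\\t]", "_");
    }

    // Prefer the file exactly as named in the torrent. Only when it is
    // absent, fall back to the mapped name (which may or may not exist yet).
    static File resolve(File dir, String name) {
        File asIs = new File(dir, name);
        if (asIs.exists())
            return asIs;
        return new File(dir, mapName(name));
    }
}
```

A seeder whose file already sits on disk under its original name takes the first branch, so nothing is remapped; a leecher on a system that cannot store that name ends up on the mapped path.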
comment:3 Changed 9 years ago by
<zzz> so you use a reversible mapping using HTML encoding, together with trying to open both versions (on a pathname-element basis I presume), and that eliminates the need for a persistent mapping. ok.
<sponge> yup
comment:4 Changed 9 years ago by
One other possible part of the fix - just don't map chars at all when creating a torrent - it's the leecher's problem, not the seeder's, to map chars.
comment:5 Changed 9 years ago by
Milestone: | 0.8.14 → 0.9.3 |
---|
comment:6 Changed 8 years ago by
Do not listen to sponge. His Robert (incl. 0.0.36-3) can still not handle umlauts.
— Luther Gillis
comment:7 Changed 8 years ago by
Milestone: | 0.9.3 → 0.9.5 |
---|
Sure, whether Robert works or has bugs or whatever, ultimately that has nothing to do with snark. But somewhere in the above are the beginnings of a solution.
The illegal char set includes several chars that are OK on windows but not on linux, and vice versa. As it says above, why remap a char on your local system as a seeder if it is there already, so it's obviously valid, just because it might cause some problem for a leech? It's their problem.
One problem with any mapping scheme is avoiding collisions, e.g. foo_, foo? and foo: all mapping to foo_ and causing general mayhem. Snark neither handles nor catches this case now.
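To make the collision case concrete, here is a minimal sketch of how such clashes could at least be detected. `CollisionCheck`, `findCollisions`, and the character set in `mapName` are assumptions for illustration, not anything snark does today.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CollisionCheck {
    // Hypothetical filter: every character assumed unsafe becomes '_'.
    static String mapName(String name) {
        return name.replaceAll("[?:*|<>\"]", "_");
    }

    // Returns the names whose mapped form was already claimed by an
    // earlier, different name in the list.
    static List<String> findCollisions(List<String> names) {
        Map<String, String> seen = new HashMap<>();
        List<String> clashes = new ArrayList<>();
        for (String name : names) {
            String prev = seen.putIfAbsent(mapName(name), name);
            if (prev != null && !prev.equals(name))
                clashes.add(name);
        }
        return clashes;
    }
}
```

With the three names from the example, foo_ claims the mapped name first, and foo? and foo: are both reported as clashes.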
The try-both-and-use-whichever-one-is-there idea in comment 2 above may be part of the solution but it requires some refactoring to make it work. See Storage.createFileFromNames(), getFileFromNames(), filterName(), etc.
New ticket #771 is probably a duplicate of this.
comment:8 Changed 8 years ago by
Milestone: | 0.9.5 → |
---|
comment:10 Changed 6 years ago by
Milestone: | → 0.9.15 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
fixed in e8883e85a7761bbda9df59b3f6b57601cc01bb5a 0.9.14.1-7 by disabling mapping on torrents that were created locally, and storing that setting in the new per-torrent config files. This is the simplest fix as we don't have to remember per-file mappings, or design some sort of reversible mapping, which probably wasn't possible anyway.
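The shape of that fix could look roughly like the sketch below. It is not the committed code: the key name `preserveFileNames`, the class `TorrentNaming`, and the use of `java.util.Properties` for the per-torrent config are all assumptions made for illustration.

```java
import java.util.Properties;

final class TorrentNaming {
    // Hypothetical key; the real per-torrent config key is not given in this ticket.
    static final String PROP_PRESERVE_NAMES = "preserveFileNames";

    // Record that a locally created torrent keeps its file names unmapped.
    static void markLocallyCreated(Properties torrentConfig) {
        torrentConfig.setProperty(PROP_PRESERVE_NAMES, "true");
    }

    // Apply the (assumed) filter only when the flag is absent or false.
    static String nameOnDisk(Properties torrentConfig, String name) {
        if (Boolean.parseBoolean(torrentConfig.getProperty(PROP_PRESERVE_NAMES)))
            return name;
        return name.replaceAll("[?:*|<>\"]", "_");
    }
}
```

Because the flag is stored per torrent, downloaded torrents keep the existing mapping behavior while locally created ones always look for the file under its original name.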
We need a flag.
Don't remap it if file is there, it's obviously already 'safe'.
As for the replacement gunk, from IRC discussion:
The way Robert 'remaps' is via unicode encoding. For example:
☃.mp3 on a system that can't handle it, renames the file to &#9731;.mp3
For things like tabs, etc, do the same. UTF-16 encode, problem solved.
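A reversible encoding along these lines might be sketched as follows. This is a guess at the idea, not Robert's implementation: `EntityNames`, `encode`, and `decode` are hypothetical, decimal character references are an assumption, and the sketch only covers characters missing from the local charset, not characters the filesystem itself forbids.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EntityNames {
    private static final Pattern REF = Pattern.compile("&#(\\d{1,7});");

    // Replace '&' and every code point the charset cannot encode with a
    // decimal character reference, so the result can be decoded again.
    static String encode(String name, Charset cs) {
        CharsetEncoder enc = cs.newEncoder();
        StringBuilder out = new StringBuilder();
        name.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            if (cp != '&' && enc.canEncode(s))
                out.append(s);
            else
                out.append("&#").append(cp).append(';');
        });
        return out.toString();
    }

    // Inverse of encode(); assumes every reference holds a valid code point.
    static String decode(String name) {
        Matcher m = REF.matcher(name);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = Integer.parseInt(m.group(1));
            String ch = new String(Character.toChars(cp));
            m.appendReplacement(out, Matcher.quoteReplacement(ch));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Escaping the literal '&' as well is what keeps the mapping reversible: without it, a file genuinely named with a character reference in it would decode to a different name.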