[Done in r86] Sort Based on Domain Subfolder?
sciencemandan
Dec 24 2009, 1:51 am
Hi Cyan,

First, I'd just like to add my thanks for your extension. It has been very useful!

I just have one quick question: would it be possible to sort files based on a subfolder?

For instance, could you send files from http://www.domain.com/subfolder1/ to one folder and files from http://www.domain.com/subfolder2/ to a different folder? This would really help me out. I tried making a filter, but it didn't seem to work. Perhaps I am doing something wrong.

Thanks so much! You have made a great contribution to the community with your extension!

Dan
Post #1
Cyan
Dec 24 2009, 6:08 pm
Thank you for your comment, I'm glad you find it useful :)


About the sub-folder: I think I will add this option when I switch to a database (1.1.0), or maybe earlier, but only as a global option, not as a per-filter option.
Currently ASF reads only the domain hosting the file and compares it with the filter's domain.
I will add an option to check either the domain only or the file's full path (including the filename).


Edit:
In fact, I think it can already work, because ASF uses the full URL when it checks the current website URL :p
When the filter's domain doesn't match the domain hosting the file, ASF checks the filter again against the full URL of the current website, so you can use that to read sub-folders.
It's not a perfect solution, because the sub-folders are read from the current website URL and not from the URL of the hosted file.

1) Check "use the current website URL when the domain doesn't match"
2) Enable dynamic folders
3) Create a filter like this one, which analyzes scheme + domain + sub-folder (+ any further sub-folders): ^(.*?:\/\/)?[^\/]+\/([^\/]+)\/
^(.*?:\/\/)? = the scheme (http://), optional because of the ? after the closing parenthesis
[^\/]+ = the domain; replace it with a specific domain if you want one, otherwise leave it as is to match any domain
\/ = the slash after the domain
([^\/]+) = the first sub-folder (without the /); replace it with a specific sub-folder, or leave it as is to match any sub-folder
\/ = the slash after the first sub-folder
Add as many ([^\/]+)\/ groups as you have sub-folders to match.

Here is an example :

NOTE : I didn't test it, please let me know if it works or doesn't.

URL = https://addons.mozilla.org/fr/firefox/addon/4781
[Filter table: Domain (Regexp), File name, Local folder; field values not preserved]


The filter is written to match a string such as:
https://addons.mozilla.org/something/
or
addons.mozilla.org/something/

Because ASF first checks the filter against the domain only ("addons.mozilla.org"), it fails to find the "something" part, so ASF falls back to the current website URL.
It then checks the full path of the website URL:
https://addons.mozilla.org/fr/firefox/addon/4781
$1d = https://
$2d = fr
so it will save to D:\fr\


If you want to capture the domain as well, add another pair of parentheses to the filter:
^(.*?:\/\/)?(addons\.mozilla\.org)\/([^\/]+)\/
$1d = https://
$2d = addons.mozilla.org
$3d = fr

You can use up to 9 capture groups in the domain filter.
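
To check the two patterns above outside of ASF, here is a minimal sketch using Python's `re` module. It only exercises the regular expressions themselves and assumes ASF's regexp engine treats these constructs the same way; the variable names are illustrative. The `\/` escapes are dropped because Python does not need them, and the dots in the domain are escaped so they match literal dots only.

```python
import re

# Optional scheme, then the domain, then the first sub-folder (captured).
PATTERN = re.compile(r"^(.*?://)?[^/]+/([^/]+)/")

url = "https://addons.mozilla.org/fr/firefox/addon/4781"
match = PATTERN.match(url)

# Groups are numbered by opening parenthesis, like $1d and $2d in the filter.
scheme, first_folder = match.group(1), match.group(2)
print(scheme, first_folder)  # https:// fr

# Variant with an extra pair of parentheses that also captures the domain.
WITH_DOMAIN = re.compile(r"^(.*?://)?(addons\.mozilla\.org)/([^/]+)/")
groups = WITH_DOMAIN.match(url).groups()
print(groups)  # ('https://', 'addons.mozilla.org', 'fr')
```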
Post #2
Cyan
Dec 25 2009, 12:13 pm
I'm making a new post to explain the regular expression.


^(.*?://)?[^/]+/([^/]+)/

Let's break it down into parts:
^
(.*?://)?
[^/]+
/
([^/]+)
/

It means:
start of the string, optionally followed by scheme://, followed by one or more characters that are not /, followed by /, followed by one or more characters that are not /, followed by /


Let's look closely:

1)
^ means the start of the string, so the rest of the regexp must match from the start of the URL, not somewhere in the middle or at the end

2)
(.*?://)?
consists of a ( ) group followed by a ? that makes the whole group optional
inside the group:
any number of characters followed by ://
. is "any single character" and * means "repeat the previous item 0 to infinite times". * is greedy: it takes as many characters as possible, so .*:// would match up to the last :// in the string.
Adding ? makes it lazy, meaning it stops as soon as possible, at the first occurrence of ://
Sometimes a URL contains "http://" twice, like http://domain.com/redirect.php?page=http://test.com/page.html

greedy will return http://domain.com/redirect.php?page=http://
lazy will return http://

So here it will match http://, https://, ftp://, etc. But remember that the whole group is optional, so it matches URLs with or without the scheme.
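
The greedy versus lazy difference can be reproduced with a short sketch in Python's `re` module (used here only as a stand-in to test the pattern, not as ASF's own code). The second URL without a scheme is a made-up example.

```python
import re

url = "http://domain.com/redirect.php?page=http://test.com/page.html"

# Greedy: .* grabs as much as possible, then backs off to the LAST "://".
greedy = re.match(r"^(.*://)?", url).group(1)

# Lazy: .*? grabs as little as possible and stops at the FIRST "://".
lazy = re.match(r"^(.*?://)?", url).group(1)

# The trailing ? makes the group optional: with no scheme it matches nothing.
no_scheme = re.match(r"^(.*?://)?", "domain.com/page.html").group(1)

print(greedy)     # http://domain.com/redirect.php?page=http://
print(lazy)       # http://
print(no_scheme)  # None
```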

3)
[^/]+
followed by [ set of characters ]+
[ ] matches any one character from those listed inside the brackets
[^ ] matches any one character that is not listed inside the brackets
the + means "repeat the previous item 1 to infinite times", so at least one character is required
so it selects every character until it finds a /, which matches any domain, including its sub-domains

4)
/
followed by a /

5)
([^/]+)/
followed again by any characters until it finds a /
the characters found are memorized and returned later because of the ( ) group
The ending / is outside the parentheses, so it is not captured; only the name of the sub-folder is returned

6)
You can repeat [^/]+/ or ([^/]+)/ as many times as needed to match nested sub-folders, captured or not captured.



This filter will not work in special cases where the URL is not a plain URL, like the one mentioned above:
http://domain.com/redirect.php?page=http://test.com/page.html
The returned sub-folder would be redirect.php?page=http: even though it's not a folder, only because the filter looks for the next /
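
A small sketch of that limitation, again using Python's `re` module only to exercise the pattern. The helper name `first_folder` and the `subfolder1/file.zip` URLs are illustrative assumptions, not part of ASF.

```python
import re

PATTERN = re.compile(r"^(.*?://)?[^/]+/([^/]+)/")

def first_folder(url):
    """Return the text captured as the first sub-folder, or None if no match."""
    match = PATTERN.match(url)
    return match.group(2) if match else None

# A plain URL: the first path segment is captured.
plain = first_folder("http://www.domain.com/subfolder1/file.zip")

# A redirect URL: the pattern only looks for the next "/", so it captures
# part of the query string even though it is not a folder.
redirect = first_folder(
    "http://domain.com/redirect.php?page=http://test.com/page.html"
)

# A file directly under the domain has no sub-folder, so nothing matches.
no_folder = first_folder("http://domain.com/file.zip")

print(plain)      # subfolder1
print(redirect)   # redirect.php?page=http:
print(no_folder)  # None
```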
Post #3
Cyan
Feb 15 2010, 11:49 pm
Sorry, it seems this filter does not work, because ASF doesn't read the full URL, only the domain part of the URL.
I will add an option in an upcoming release to let the user choose which one to use (full URL or domain only).
Post #4
cloudkicker
Apr 2 2010, 2:59 am
Just wanted to bump this feature request as one often wants to save files into folders based on capture groups extracted from the full url path of the file.

The easiest way to do this would be to have an option to toggle the domain field to "domain only" (that would filter the domain up to the first "/" after the TLD) or to "domain + path" (that would include everything up to the last "/", but exclude the filename).

The other option (that might be unnecessarily complex) is to add a third field (after domain) that says "path". And that would include everything after the first "/" and before the last "/". I think the first option to toggle "domain" to either "domain only" or "full path" is best.

Really appreciate how responsive you are to user feedback. I was just taking a peek at the code and love seeing ASF get increasingly powerful and intuitive as you get more comfortable with coding for Firefox. Keep up the awesome work :D
Post #5
Cyan
Apr 7 2010, 12:13 am
Thank you for your encouragement :)
I haven't been very active on development lately.
I'm still learning and improving when I can. It's not always easy to learn by yourself.
I need to experiment a lot and analyze other add-ons :o

If you know how to code for Firefox too (mostly in JavaScript), let me know if you see something I can fix or improve.


I prefer the idea of a toggle between "domain only" and "full path" (with or without the page/downloaded file name? I'll see what works best).

Post #6
Cyan
Aug 12 2010, 6:40 pm
I just added (rev86) the ability to define which domain type ASF uses to check the filters.

Now ASF can check :
1- File's domain
2- File's URL
3- File's URL + Filename
4- Tab's domain (or the file's domain if no data is found)
5- Tab's URL (or the file's URL if no data is found)

They can be combined and checked in whichever order you want :)
Go to ASF preferences/page 2/while filtering: enter the order as comma-separated type numbers.
1,2 would work well for you: if the domain alone doesn't match the filter, ASF checks the file's full URL.
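
The ordered check can be pictured with the following sketch. It is not ASF's code: the helper names are hypothetical, the exact strings ASF builds for each type are an assumption, and types 4 and 5 are left out because they need data from the browser tab. The `file.zip` URL is a made-up example based on the original question.

```python
import re
from urllib.parse import urlsplit

def candidates(file_url):
    """Build the strings for types 1-3 from a file URL (assumed layout)."""
    parts = urlsplit(file_url)
    domain = f"{parts.scheme}://{parts.netloc}"
    folder, _, filename = parts.path.rpartition("/")
    return {
        1: domain,                          # file's domain
        2: f"{domain}{folder}/",            # file's URL, without the filename
        3: f"{domain}{folder}/{filename}",  # file's URL + filename
    }

def first_matching_type(pattern, file_url, order):
    """Try each type in the given order; return the first one that matches."""
    strings = candidates(file_url)
    for kind in order:
        if re.search(pattern, strings[kind]):
            return kind
    return None

PATTERN = r"^(?:.*?://)?[^/]+/([^/]+)/"
URL = "http://www.domain.com/subfolder1/file.zip"

print(first_matching_type(PATTERN, URL, (1, 2)))  # 2
print(first_matching_type(PATTERN, URL, (1,)))    # None
```

With the order 1,2 the domain alone fails because it contains no sub-folder, and the check falls through to type 2.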

Now, to answer your first question about filtering sub-folders:

[Filter table: Domain (Regexp), File name, Local folder; field values not preserved]


^ = start of the domain string, followed by
(?:.*?://)? = the protocol, optional (the ?: makes the group non-capturing, so $1d will not be this one)
[^/]+/ = the domain
([^/]+)/ = each sub-folder, referenced as $1d, $2d, etc.
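
The effect of the non-capturing group on numbering can be seen in this sketch, which uses Python's `re` module as a stand-in. The URL with two sub-folders is a made-up example.

```python
import re

url = "http://www.domain.com/subfolder1/subfolder2/file.zip"

# Capturing scheme group: the scheme takes number 1, the sub-folder number 2.
capturing = re.match(r"^(.*?://)?[^/]+/([^/]+)/", url).groups()

# Non-capturing (?: ) scheme group: numbering starts at the first sub-folder.
non_capturing = re.match(r"^(?:.*?://)?[^/]+/([^/]+)/([^/]+)/", url).groups()

print(capturing)      # ('http://', 'subfolder1')
print(non_capturing)  # ('subfolder1', 'subfolder2')
```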
Post #7
Soforias
Sep 25 2011, 6:10 pm
Is this available in version 1.0.2? I cannot seem to get it to work.

^(?:.*?://)?[^/]+/([^/]+) Does this match any domain as a dynamic filter, or do I still have to add more domain information?

Please provide an example of a domain/sub domain using this configuration.

Also, is there a way to use the first part of a file name as the download path, so that paths are created dynamically for everything?



Oh and I would like to make a donation.


Great add on! Thanks for your time!

Post #8
Soforias
Sep 25 2011, 6:35 pm
Also, just wondering: when using the FlashGot "All" function, can the file paths still be set by ASF? It seems to skip the ASF filter.
Post #9
Cyan
Sep 27 2011, 4:59 pm
Hi,

Yes, this is available in 1.0.2.
Did you set the filter type order to 1,2 on page 2 of the options?
You can also use "3" instead of "2" if you want the filename included in the string used to check the filter.

As for an example, I uploaded 4 files you can test with.

Imagine you want to download files located here:
http://asf.mangaheart.org/filters/test_subdomain/image_test_01.zip
http://asf.mangaheart.org/filters/test_subdomain/image_test_02.zip
http://asf.mangaheart.org/filters/test_subdomain/text_test_01.zip
http://asf.mangaheart.org/filters/test_subdomain/text_test_02.zip
(The links are real if you want to try yourself)


You also asked how to use the "first part of the filename" as an element of the saved path. The filenames use this form: aaaaaaa_bbbbbb_cc.zip
I extract the aaaaaaa part from the filename by placing parentheses around the part I want:
(.*)_.*_.*$
aaaa_bbbbb_ccccc.ddd

Note that I'm using the underscore as a position marker, but you can also take a specific number of characters from the beginning of the name. Check the Regexp help button when creating a filter; you can specify the number of characters like this:
.{10} = 10 characters.
.* = as many characters as it can find (0 to infinite); the * is greedy.
So you can use (.{10}).*$ to capture the first 10 characters of the filename:
^(?:.*?://)?[^/]+/([^/]+)/([^/]+)/(.{10}).*$
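
Both filename techniques can be tried in isolation with the sketch below (Python's `re` module, helper names hypothetical). It also shows two behaviours worth knowing: the greedy `(.*)` keeps everything before the last two underscores, and `.{10}` needs at least 10 characters to match at all.

```python
import re

def prefix_by_underscore(filename):
    """Capture the part before the last two underscores, or None."""
    match = re.search(r"(.*)_.*_.*$", filename)
    return match.group(1) if match else None

def prefix_by_length(filename, count=10):
    """Capture the first `count` characters, or None if the name is shorter."""
    match = re.match(r"(.{%d}).*$" % count, filename)
    return match.group(1) if match else None

print(prefix_by_underscore("image_test_01.zip"))  # image
print(prefix_by_underscore("a_b_c_d.zip"))        # a_b (greedy capture)
print(prefix_by_underscore("nounderscore.zip"))   # None
print(prefix_by_length("image_test_01.zip"))      # image_test
print(prefix_by_length("short.zip"))              # None (only 9 characters)
```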

Step 1. Go to options > page 2 > set the filter search order: 1,3
3 = use the full URL including the filename as the string checked against the domain filter.


Step 2. Create the filter :
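Domain (Regexp): ^(?:.*?://)?[^/]+/([^/]+)/([^/]+)/(.*)_.*_.*$
File name: * (All)
Local folder: D:\download\$2d\$3d\
(values as reported in the console output further down)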


Step 3. Make sure the "Regexp" checkbox is ticked for the domain input field, or else the filter will not work.

Step 4. Copy one of the links above and paste it into a new tab.
It will download into:

D:\Download\test_subdomain\aaaaaaa\
(aaaaaaa being the first part of the filename, e.g. image for image_test_01.zip)


You can open the console while downloading to see what ASF does.
Press Ctrl+Shift+J



Here is what is shown on the console:
Automatic Save Folder :
These data will be used to verify the filters :
Filename: image_test_01.zip
Domain test order: 1,3
1 - File's domain: http://asf.mangaheart.org
2 - File's URL: http://asf.mangaheart.org/filters/test_subdomain/
3 - Full file's URL: http://asf.mangaheart.org/filters/test_subdomain/image_test_01.zip

Automatic Save Folder :
Filter 1 matched domain type : 3

Automatic Save Folder :
Filter 1 is matching both domain and filename.
Domain: ^(?:.*?://)?[^/]+/([^/]+)/([^/]+)/(.*)_.*_.*$
Filename: *
Folder: D:\download\$2d\$3d\

$1d = filters (the first sub-folder of the URL)
$2d = test_subdomain (the second sub-folder of the URL)
$3d = image (the first part of the filename, the aaaaaaa part)
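
As a rough model of how the captured groups end up in the folder path, the sketch below expands `$Nd` placeholders from a regexp match. This is only an illustration written in Python; the function name and the substitution logic are assumptions, not ASF's implementation.

```python
import re

def expand_folder(template, domain_pattern, checked_string):
    """Replace $1d..$9d in `template` with groups captured from the string."""
    match = re.search(domain_pattern, checked_string)
    if match is None:
        return None  # the filter does not match, so there is nothing to expand

    def substitute(placeholder):
        index = int(placeholder.group(1))
        if index > match.re.groups:
            return placeholder.group(0)  # leave unknown placeholders untouched
        return match.group(index) or ""

    return re.sub(r"\$([1-9])d", substitute, template)

DOMAIN = r"^(?:.*?://)?[^/]+/([^/]+)/([^/]+)/(.*)_.*_.*$"
FULL_URL = "http://asf.mangaheart.org/filters/test_subdomain/image_test_01.zip"

folder = expand_folder("D:\\download\\$2d\\$3d\\", DOMAIN, FULL_URL)
print(folder)  # D:\download\test_subdomain\image\
```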




If you want me to create a filter for you, you can create a new thread in the Filter request forum. Let me know the full URL of the file you want to download, and if possible at least 2 or 3 different files, so I can see which parts they have in common.



===================
About FlashGot: I haven't tried that add-on or any compatible download program yet, so I don't know how the path/file URL is passed to the external program.
ASF currently doesn't work with it, but support may be added later. I'll have to check how it works.

===================
I appreciate that you want to make a donation, but I don't have a donation account.
Also, I'm not working on it much; it has been a year since I last updated it, so I don't think I deserve it for the time I spend on this add-on ;)

Post #10