URL Design and periods


URL Design and periods

Greg Hemphill
I'm expanding my use of URL design to cover some URLs that are  
generated from recipe titles in a database.  I've encountered a  
problem that involves Apache thinking any periods in the text should  
resolve to a file vs. being passed off to my Lasso URL design stuff.

For example, this URL doesn't work (notice the "B.L.A."):
http://dev6.webstop.com/Recipes/Detail/6128/B.L.A._(Bacon%2c_Lettuce%2c_Apple)_Bites/

But this one does:
http://dev6.webstop.com/Recipes/Detail/6128/BLA_(Bacon%2c_Lettuce%2c_Apple)_Bites/

I thought I had found a good solution with Jason's urlpath tags:
http://tagswap.net/encode_urlpath/
http://tagswap.net/decode_urlpath/

These tags take a period and encode it to "& # 4 6 ;" (added spaces  
between each character to make sure your mail client doesn't display a  
period). However when I use the tags I still get periods in the URL  
bar after clicking the link (even though view source shows the  
encoding characters instead of the period).  So the tags do in fact  
encode the periods, but my browser is "smart" and replaces the encoded  
characters with actual periods in the URL bar.  (tested in Safari 3.x  
and Firefox 2.x).

So I'm wondering if I'm using Jason's tags wrong, or if there are  
other solutions I should be looking at.

I've considered just converting the periods to some other  
character (maybe a dash), however I'm thinking that will goof things  
when a recipe title actually contains that character... so I'm just  
wondering what ideas others have or what solutions are being employed.

A related question... I know some of you are converting spaces to  
dashes or underscores, but I'm wondering how you handle an actual dash  
or underscore that isn't meant to be a space.

Thanks,
Greg



--
This list is a free service of LassoSoft: http://www.LassoSoft.com/
Search the list archives: http://www.ListSearch.com/Lasso/Browse/
Manage your subscription: http://www.ListSearch.com/Lasso/


Re: URL Design and periods

Marc Pinnell-3
Greg,

I ran into the same problem and modified Jason's tag thusly:

        define_tag(
                'urlpath',
                -namespace='encode_',
                -required='in',
                -priority='replace',
                -description='Makes a string safe to use as a URL path component with Apache.'
        );
                local('out') = #in;
                #out->replace(' ','_')&replace('/','-!')&replace('\'','`');
                #out = encode_stricturl(#out);
                #out->replace('.','~');
                #out->replace('!','%21');
                #out->replace('ñ','&#241;');
                #out->replace('%5c','&#92;');
                return(@#out);
        /define_tag;



Marc

On Oct 6, 2008, at 11:37 AM, Greg Hemphill wrote:

> I'm expanding my use of URL design to cover some URLs that are  
> generated from recipes titles in a database.  I've encountered a  
> problem that involves Apache thinking any periods in the text should  
> resolve to a file vs. being passed off to my Lasso URL design stuff.

--
Marc Pinnell
1027 Design
PO Box 990872
Redding, CA 96099-0872
530.941.4706
fax: 866.232.5300
www.1027Design.com





Re: URL Design and periods

jasonhuck
How are you telling Apache which URLs to pass off?

I use a set of LocationMatch rules which allows periods anywhere
except the last few characters:


# Rules For [define_atbegin] Processing
# -------------------------------------
# Set Lasso to handle:
# Any URL that ends with a period, with or without a trailing slash.
<LocationMatch "^.+\./?$">
        SetHandler lasso8-handler
</LocationMatch>

# Any URL that contains periods, where the last period is
# at least 5 characters, and no more than 240 characters,
# from the end.
<LocationMatch "^.+\.[^\.]{5,240}$">
        SetHandler lasso8-handler
</LocationMatch>

# Any URL that does not contain a period at all.
<LocationMatch "^[^\.]+$">
        SetHandler lasso8-handler
</LocationMatch>
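
One way to see what the three rules above do is to replay them against sample paths. This is a minimal Python sketch, not part of the original setup. It assumes Python's `re` module treats these simple patterns the same way Apache's regex engine does. Apart from the recipe path (shown with its commas decoded), the sample paths are made up for illustration.

```python
import re

# The three LocationMatch patterns from the rules above, copied verbatim.
LASSO_PATTERNS = [
    r"^.+\./?$",            # ends with a period, optional trailing slash
    r"^.+\.[^\.]{5,240}$",  # last period is 5 to 240 characters from the end
    r"^[^\.]+$",            # no period anywhere
]

def handled_by_lasso(path):
    """Return True if any of the three patterns matches the URL path."""
    return any(re.search(pattern, path) for pattern in LASSO_PATTERNS)

# Illustrative paths; only the first one comes from the thread.
samples = [
    "/Recipes/Detail/6128/B.L.A._(Bacon,_Lettuce,_Apple)_Bites/",
    "/Recipes/Detail/6128/",
    "/Recipes/Detail/7/Pizza_No.1/",
    "/images/logo.gif",
]

for path in samples:
    print(path, "->", "Lasso" if handled_by_lasso(path) else "Apache")
```

Under these assumptions the "B.L.A." recipe path is routed to Lasso, while a path whose last period falls within four characters of the end (the hypothetical "Pizza_No.1/") is left to Apache.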


There are some alternatives to this floating around if you check the archives.

- jason


--
tagSwap.net :: Open Source Lasso Code
<http://tagSwap.net/>


Re: URL Design and periods

Greg Hemphill
In reply to this post by Marc Pinnell-3
That seems to work fine... thanks!

Now to implement the "no ~ in recipe titles" policy :)

On Oct 6, 2008, at 2:53 PM, Marc Pinnell wrote:

> Greg,
>
> I ran into the same problem and modified Jason's tag thusly:
>
> #out->replace('.','~');
This line did the trick



Re: URL Design and periods

Greg Hemphill
In reply to this post by jasonhuck
I've seen this example before, but never quite got why people were  
putting it in there until now... thinking I'll add it. I think it will  
still fall apart if a period is fewer than 5 characters from the end  
of the URL, but this at least reduces the chances that problems will  
occur.

Thanks,
G

On Oct 6, 2008, at 3:02 PM, Jason Huck wrote:

> # Any URL that contains periods, where the last period is
> # at least 5 characters, and no more than 240 characters,
> # from the end.
> <LocationMatch "^.+\.[^\.]{5,240}$">
> SetHandler lasso8-handler
> </LocationMatch>


Re: URL Design and periods

Bil Corry-3
In reply to this post by Greg Hemphill
Greg Hemphill wrote on 10/6/2008 1:37 PM:
> I'm expanding my use of URL design to cover some URLs that are generated
> from recipes titles in a database.  I've encountered a problem that
> involves Apache thinking any periods in the text should resolve to a
> file vs. being passed off to my Lasso URL design stuff.

That's a function of the Apache directive you're using to pass virtual paths to Lasso.  Most likely you're using the most common one, which specifies the rule as being "pass any URL that doesn't contain periods to Lasso" -- looks something like this:

        <LocationMatch "^[^\.]+$">
                SetHandler lasso8-handler
        </LocationMatch>

Jason already posted his solution to this thread, which is to specify a variety of rules that allow periods in certain cases.  I actually took his rules and combined them into one back in May:

    <LocationMatch "(^|/)([^.]*|([^.]+\.)+[^.]{5,}|.*\.)$">
        SetHandler lasso8-handler
    </LocationMatch>
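
The combined pattern can be checked the same way. This is a rough Python sketch, assuming Python's `re` behaves like Apache's regex engine for this pattern; the sample paths are invented for illustration.

```python
import re

# Combined pattern copied verbatim from the LocationMatch directive above.
COMBINED = re.compile(r"(^|/)([^.]*|([^.]+\.)+[^.]{5,}|.*\.)$")

def handled_by_lasso(path):
    """Return True if the combined pattern matches anywhere in the URL path."""
    return COMBINED.search(path) is not None

# Illustrative paths, not taken from any real site.
samples = [
    "/Recipes/Detail/6128/",
    "/report.final",
    "/ends-with-period.",
    "/style.css",
    "/Recipes/Detail/7/Pizza_No.1/",
]

for path in samples:
    print(path, "->", "Lasso" if handled_by_lasso(path) else "Apache")
```

One thing this sketch shows: because `(^|/)` can anchor at the final slash and `[^.]*` can match nothing, any path ending in a slash matches, even when a period sits close to the end of the path.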

But since then I've changed my approach to instead send any non-existent URLs to Lasso.  The advantage is anything that would normally 404 is now sent to Lasso and from there you can decide if it's something Lasso should process (as part of your webapp) or decide it's not and return 404.  And the 404 you return can be customized to maybe return a list of possible pages that the person might try.  The disadvantage is anything that would normally 404 is now sent to Lasso and it means the load on Lasso has increased.

If you're interested, here are the directives for Apache to implement it (choose your Apache version):


Apache 2.2
----------

   <VirtualHost *:80>

   # your vhost stuff here

   # Lasso gets all virtual files and directories
   # Apache 2.2
   RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$
   RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
   RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d
   RewriteRule  (.*)  -  [E=XREQUEST:$1]
   RequestHeader set X-Request %{XREQUEST}e env=XREQUEST
   RewriteCond %{ENV:XREQUEST} .+
   RewriteRule (.*) - [H=lasso8-handler,PT]

   </VirtualHost>



Apache 2.0
----------

   <VirtualHost *:80>

   # your vhost stuff here

   # Lasso gets all virtual files and directories
   # Apache 2.0
   RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$
   RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
   RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d
   RewriteRule  (.*)  -  [E=XREQUEST:$1]
   RequestHeader set X-Request %{XREQUEST}e env=XREQUEST
   RewriteCond %{ENV:XREQUEST} .+
   RewriteRule (.*) /processvirtualurl
   <Files processvirtualurl>
     SetHandler lasso8-handler
     RewriteRule (.*)  %{ENV:XREQUEST}  [L]
   </Files>

   </VirtualHost>


You'll also need to enable mod_rewrite and mod_headers.


Here's a simple atBegin handler to go with it -- all virtual URLs will display a page with the headers, params (you'll need lp_client_params) and response_filepath:

    [

    define_atBegin({

        if: client_headers->contains('X-REQUEST'); // virtual path?

            $__html_reply__ = '<pre>' + encode_html(client_headers) + '\r\r' + encode_html(lp_client_params->join('\r')) + '\r\r' + encode_html(response_filepath) + '</pre>';
            abort;
   
        /if;
    });

    ]

Obviously you'll need to tweak the handler depending on your app.  The thing to take away is that the presence of the client header X-REQUEST tells you if the URL is virtual or if the path is real.



> A related question... I know some of you are converting spaces to dashes
> or underscores, but I'm wondering how you handle an actual dash or
> underscore that isn't meant to be a space.

I get the feeling here that you're trying to create a reversible slug -- e.g. the title of the recipe is converted into a slug, which then is put into the URL, Lasso then extracts the slug from the URL, reverses the encoding on the slug to get the recipe title (and allows the recipe lookup via the title).

I'd suggest instead creating a hashed slug -- e.g. the title of the recipe is transformed into a unique slug that is stored along with the recipe that then allows the slug to be used in the URL and makes finding the appropriate recipe easy based on the slug -- no more reversing!

The big advantage to a hashed slug is that you can then SEO the slug by transforming it using an SEO-friendly format.  This means making the slug all lowercase, separating words with hyphens (not underscores), removing punctuation, and either removing any characters that would have to be URL-encoded (e.g. %2C) or transforming them from high-ASCII to plain-ASCII equivalents.  The goal here is to make the slug very readable with just the keywords, not necessarily to duplicate the original title word for word.  I've also read that you should drop common words, but it depends on how much effort you want to put into it.  I'll probably tackle this issue with a ctag once I get to setting up LassoBlogger.
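
As a rough illustration of that transformation, here is a minimal Python sketch (not Lasso, and not the ctag mentioned above). The function names `slugify` and `unique_slug` are hypothetical, and the numeric suffix used to keep stored slugs unique is an assumption rather than something specified in this thread.

```python
import re
import unicodedata

def slugify(title):
    """Lowercase, ASCII-only, hyphen-separated slug built from a title."""
    # Split accented characters into base letter + accent, then drop the accents.
    ascii_text = (unicodedata.normalize("NFKD", title)
                  .encode("ascii", "ignore")
                  .decode("ascii"))
    lowered = ascii_text.lower()
    # Remove punctuation; keep letters, digits, whitespace, underscores, hyphens.
    words_only = re.sub(r"[^a-z0-9\s_-]", "", lowered)
    # Collapse runs of whitespace, underscores, and hyphens into one hyphen.
    return re.sub(r"[\s_-]+", "-", words_only).strip("-")

def unique_slug(title, existing_slugs):
    """Append -2, -3, ... until the slug is not already taken (assumed scheme)."""
    base = slugify(title)
    candidate, suffix = base, 2
    while candidate in existing_slugs:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate

print(slugify("B.L.A. (Bacon, Lettuce, Apple) Bites"))
print(unique_slug("Apple Pie", {"apple-pie"}))
```

The slug would be stored with the recipe when it is saved, so the lookup is by the stored value and nothing ever needs to be decoded back into the title.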


- Bil



Re: URL Design and periods

Greg Hemphill
In reply to this post by Greg Hemphill
For those following this thread, here is what I came up with:  
Jason's tags, modified with Marc's suggestions, plus a step that converts  
underscores to -* so that real underscores won't get decoded to spaces.

        define_tag(
                'urlpath',
                -namespace='encode_',
                -required='in',
                -priority='replace',
                -description='Makes a string safe to use as a URL path component with Apache.'
        );
                local('out') = #in;
                #out->replace('_','-*')&replace(' ','_')&replace('/','-!')&replace('\'','`');
                #out = encode_stricturl(#out);
                #out->replace('.','~')&replace('%5c','&#92;')&replace('!','%21')&replace('ñ','&#241;');
                return(@#out);
        /define_tag;

        define_tag(
                'urlpath',
                -namespace='decode_',
                -required='in',
                -priority='replace',
                -description='Decodes a string encoded by [encode_urlpath].'
        );
                local('out') = #in;
                #out->replace('_',' ')&replace('-*','_')&replace('&#92;','%5c')&replace('&#46;','.')&replace('~','.')&replace('%21','!')&replace('&#241;','ñ');
                #out = decode_url(#out);
                #out->replace('-!','/')&replace('`','\'');
                return(@#out);
        /define_tag;


On Oct 6, 2008, at 3:44 PM, Greg Hemphill wrote:

> That seems to work fine... thanks!
>
> Now in implement the no ~ in recipe titles policy :)

Re: URL Design and periods

Johan Solve
In reply to this post by Bil Corry-3
On Mon, Oct 6, 2008 at 10:49 PM, Bil Corry <[hidden email]> wrote:
> But since then I've changed my approach to instead send any non-existent URLs to Lasso.

That's a very clever approach, and as a bonus it also makes it easier
to handle things like protected downloads and controlled image serving
(using file_serve or file_Stream) while still using perfectly sane
URLs.
This might be the holy grail of URL design.


--
Mvh
Johan Sölve
____________________________________
Montania System AB
Halmstad   Stockholm   Malmö
http://www.montania.se

Johan Sölve
Mobil +46 709-51 55 70
[hidden email]

Kristinebergsvägen 17, S-302 41 Halmstad, Sweden
Telefon +46 35-136800 |  Fax +46 35-136801


Re: URL Design and periods

stevepiercy
In reply to this post by Bil Corry-3
On Monday, October 6, 2008, [hidden email] (Bil Corry) pronounced:

>   RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$

What is this in English?

    "If the URI does not match (some grouping I don't understand,
    perhaps case-insensitive modifier?) followed by a new line
    (beginning of URI), then any character repeated any times, and
    ending with '.LassoApp'"

--steve

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Steve Piercy               Web Site Builder               Soquel, CA
<[hidden email]>                  <http://www.StevePiercy.com/>


Re: URL Design and periods

Johan Solve
On Thu, Oct 9, 2008 at 12:02 PM, Steve Piercy - Web Site Builder
<[hidden email]> wrote:

> On Monday, October 6, 2008, [hidden email] (Bil Corry) pronounced:
>
>>   RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$
>
> What is this in English?
>
>    "If the URI does not match (some grouping I don't understand,
>    perhaps case-insensitive modifier?) followed by a new line
>    (beginning of URI), then any character repeated any times, and
>    ending with '.LassoApp'"

(?i) is case insensitive, yes.
^ is not new line, it's just "the beginning", usually the beginning of
a line. The rest of your assumptions are (also) correct, "ending with"
being enforced by the "$".
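
The pieces of that pattern can be tried out in isolation. This is a small Python sketch, assuming Python's `re` handles `(?i)`, `^` and `$` here the same way Apache's regex engine does. The leading `!` in the RewriteCond is mod_rewrite's negation marker, so it is left out of the pattern, and the sample URIs are illustrative.

```python
import re

# Regex part of the RewriteCond, without mod_rewrite's leading "!" negation.
LASSOAPP = re.compile(r"(?i)^.*\.LassoApp$")

def is_lassoapp(uri):
    """Return True if the URI ends with .LassoApp, ignoring case."""
    return LASSOAPP.search(uri) is not None

for uri in ["/ServerAdmin.LassoApp",     # same case as the pattern
            "/serveradmin.lassoapp",     # (?i) makes case irrelevant
            "/ServerAdmin.LassoApp/x"]:  # "$" requires the URI to end here
    print(uri, "->", is_lassoapp(uri))
```

In the actual directive the condition is negated, so the rewrite only continues for URIs where this check is false.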

--
Mvh
Johan Sölve
____________________________________
Montania System AB
Halmstad   Stockholm   Malmö
http://www.montania.se

Johan Sölve
Mobil +46 709-51 55 70
[hidden email]

Kristinebergsvägen 17, S-302 41 Halmstad, Sweden
Telefon +46 35-136800 |  Fax +46 35-136801


Re: URL Design and periods

Bil Corry-3
Johan Solve wrote on 10/9/2008 5:51 AM:

> (?i) is case insensitive yes.
> ^ is not new line, it's just "the beginning", usually the beginning of
> a line. The rest of your assumptions are (also) correct, "ending with"
> being enforced by the "$".
>

That's it.

The logic behind it is it excludes LassoApps from being found as virtual, since all the built-in LassoApps are virtual and if your atBegin handler catches them, then it screws up ServerAdmin.LassoApp, SiteAdmin.LassoApp, etc.


- Bil




Re: URL Design and periods

Bil Corry-3
Bil Corry wrote on 10/9/2008 7:23 AM:

> That's it.
>
> The logic behind it is it excludes LassoApps from being found as virtual, since all the built-in LassoApps are virtual and if your atBegin handler catches them, then it screws up ServerAdmin.LassoApp, SiteAdmin.LassoApp, etc.
>

And BTW, you can add more exclusions.  What I've provided is the maximum allowable virtual URLs -- pretty much anything that doesn't physically map to a file on the disk (the exception being LassoApps).  But if you know you will never have virtual JavaScript files, or images, CSS, etc, you can exclude them and let Apache handle the 404:

        # Exclude any file with the extension .LassoApp, .GIF, .JPG, .JS, .CSS
        RewriteCond %{REQUEST_URI}  !(?i)^.*\.(LassoApp|GIF|JPG|JS|CSS)$
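
To see which requests that single condition would still let through, here is a short Python sketch. It checks only this one RewriteCond, not the file and directory tests that accompany it. It assumes Python's `re` matches this pattern the way Apache does, and the sample URIs are hypothetical.

```python
import re

# Pattern from the RewriteCond above, without mod_rewrite's "!" negation.
EXCLUDED = re.compile(r"(?i)^.*\.(LassoApp|GIF|JPG|JS|CSS)$")

def passes_exclusion(uri):
    """True if the URI is NOT one of the excluded extensions."""
    return EXCLUDED.search(uri) is None

# Hypothetical request URIs.
for uri in ["/Recipes/Detail/6128/", "/img/missing.gif", "/js/app.JS",
            "/SiteAdmin.LassoApp", "/data/notes.json"]:
    print(uri, "->", "may go to Lasso" if passes_exclusion(uri) else "left to Apache")
```

Note that the extension has to be the very end of the URI, so a hypothetical `.json` request is not caught by the `JS` alternative.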


- Bil




Re: URL Design and periods

stevepiercy
In reply to this post by Johan Solve
On Thursday, October 9, 2008, [hidden email] (Johan Solve) pronounced:

>(?i) is case insensitive yes.
>^ is not new line, it's just "the beginning", usually the beginning of
>a line. The rest of your assumptions are (also) correct, "ending with"
>being enforced by the "$".

(?i) is what was throwing me.  I looked all over the Apache docs for mod_rewrite for this string, but found nothing.  Is there a reference that contains a list of similar modifiers?

--steve

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Steve Piercy               Web Site Builder               Soquel, CA
<[hidden email]>                  <http://www.StevePiercy.com/>


Re: URL Design and periods

Bil Corry-3
Steve Piercy - Web Site Builder wrote on 10/9/2008 1:24 PM:
> (?i) is what was throwing me.  I looked all over the Apache docs for mod_rewrite for this string, but found nothing.  Is there a reference that contains a list of similar modifiers?

I go here first:

        http://www.regular-expressions.info/

That modifier and others are specifically listed here:

        http://www.regular-expressions.info/refadv.html

And if I need something really tricky, I read through the PCRE man pages (Lasso uses PCRE for its regex engine):

        http://pcre.org/pcre.txt

Some day I'll build a parser to convert those man pages into something more palatable...


- Bil



Re: URL Design and periods

Bil Corry-3
Bil Corry wrote on 10/9/2008 1:29 PM:
> Steve Piercy - Web Site Builder wrote on 10/9/2008 1:24 PM:
>> (?i) is what was throwing me.  I looked all over the Apache docs for mod_rewrite for this string, but found nothing.  Is there a reference that contains a list of similar modifiers?
>
> That modifier and others are specifically listed here:
>
> http://www.regular-expressions.info/refadv.html

Besides (?i) for case-insensitivity, the other two that are the most helpful are the "dot matches newlines" (?s) and the "multi-line" (?m).  Here's an example of each:

        var('text') = 'a\r\nb\r\nc\r\nabcd\r\nabc efg';

        string_findregexp($text, -find='a.+c');'<br>';
        string_findregexp($text, -find='(?s)a.+c');'<br>';

        string_findregexp($text, -find='^abc.*$');'<br>';
        string_findregexp($text, -find='(?m)^abc.*$');'<br>';


Returns:

        array: (abc), (abc)
        array: (a b c abcd abc)
        array
        array: (abcd ), (abc efg)
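
The same four searches can be reproduced outside Lasso. This is a minimal Python sketch using the `re` module as a stand-in for PCRE (an assumption; the two engines agree on these flags for this input), with the same sample text as above.

```python
import re

# Same sample text as the Lasso example above.
text = "a\r\nb\r\nc\r\nabcd\r\nabc efg"

default_dot = re.findall(r"a.+c", text)        # "." stops at "\n"
dotall = re.findall(r"(?s)a.+c", text)         # "." also matches "\n"
single_line = re.findall(r"^abc.*$", text)     # "^" and "$" anchor the whole string
multi_line = re.findall(r"(?m)^abc.*$", text)  # "^" and "$" anchor each line

for label, result in [("default", default_dot), ("(?s)", dotall),
                      ("no (?m)", single_line), ("(?m)", multi_line)]:
    print(label, result)
```

The first multi-line match keeps its trailing "\r" because "." matches a carriage return, which is also why the Lasso output shows a space after "abcd".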


- Bil



Re: URL Design and periods

Bil Corry-3
In reply to this post by Bil Corry-3
(be sure to read my message below if you use mod_rewrite with Lasso)

Bil Corry wrote on 10/6/2008 3:49 PM:

> Apache 2.0
> ----------
>
>    <VirtualHost *:80>
>
>    # your vhost stuff here
>
>    # Lasso gets all virtual files and directories
>    # Apache 2.0
>    RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$
>    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
>    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d
>    RewriteRule  (.*)  -  [E=XREQUEST:$1]
>    RequestHeader set X-Request %{XREQUEST}e env=XREQUEST
>    RewriteCond %{ENV:XREQUEST} .+
>    RewriteRule (.*) /processvirtualurl
>    <Files processvirtualurl>
>      SetHandler lasso8-handler
>      RewriteRule (.*)  %{ENV:XREQUEST}  [L]
>    </Files>
>
>    </VirtualHost>

I did a lot of testing today and have fixed/tweaked a few things with my Apache 2.0 rules:

<VirtualHost *:80>

    ### your vhost stuff here ###

    RewriteEngine On

    # Lasso gets all virtual files and directories
    # Apache 2.0 (and older?)
    RewriteCond %{IS_SUBREQ}  false
    RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d [OR]
    RewriteCond %{REQUEST_URI}  ^/$
    RewriteRule  (.*)  /processvirtualurl  [L,E=XREQUEST:$1]
    RequestHeader set X-Request %{XREQUEST}e env=XREQUEST

    <Files processvirtualurl>
        SetHandler lasso8-handler
    </Files>

</VirtualHost>


The most important thing I discovered is that, whether you use my rules or your own, you should add the following condition before your RewriteRule:

    RewriteCond %{IS_SUBREQ}  false

Lasso performs a slew of sub-requests to figure out where files are located (for includes, etc.).  Unless you exclude them, those sub-requests get processed by your mod_rewrite rules just like browser requests.  This may be why file_exists returns true for non-existent files: your mod_rewrite rules are tricking Lasso.  Excluding sub-requests from the mod_rewrite rules may fix the issue.  Some background and another less-than-ideal solution are here (maybe this isn't valid anymore?):

        http://www.lassosoft.com/Support/Notes/index.lasso?7558


- Bil



Re: URL Design and periods

stevepiercy
Very nice work.

So, for example, if one did the following:

    include('/oops/wrong/filepath/header.inc');

one would see the result of however Apache processes and hands off the request to Lasso, wherever the include appears in the page?

But if one's code is perfect...

    include('/this/time/fershure/header.inc');

...one would not observe any problems?

--steve


On Sunday, October 12, 2008, [hidden email] (Bil Corry) pronounced:

>The most important thing I discovered is that no matter if you use my rules or your
>own, be sure to add the following conditional before your RewriteRule:
>
>    RewriteCond %{IS_SUBREQ}  false
>
>Lasso will perform a slew of sub-requests to figure out where files are located (for
>includes etc), and unless you exclude the sub-requests, it means Lasso's
>sub-requests will get processed by your mod_rewrite just like browser requests.  
>This may be why file_exists returns true for non-existent files; your mod_rewrite
>rules are tricking Lasso.  Excluding sub-requests from being processed by the
>mod_rewrite rules may fix the issue.  Some background and another less-than-ideal
>solution is here (maybe this isn't valid anymore?):
>
>   http://www.lassosoft.com/Support/Notes/index.lasso?7558
>
>
>- Bil
>
>

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Steve Piercy               Web Site Builder               Soquel, CA
<[hidden email]>                  <http://www.StevePiercy.com/>


Re: URL Design and periods

Bil Corry-3
Steve Piercy - Web Site Builder wrote on 10/12/2008 4:28 AM:
> Very nice work.
>
> So for example, if one did the following.
>
>     include('/oops/wrong/filepath/header.inc');

Lasso would then make a few sub-requests.  I didn't study it very carefully, but it appeared Lasso would make the following requests for that page:

        /LassoProcessTag
        /
        /oops/wrong/filepath/
        /oops/wrong/filepath/header.inc


I think that's how Lasso gets its bearings for the various file paths required for the page.  Apache will also make sub-requests for your index files, so if the request is for a directory, you'll see Apache making sub-requests for:

        /path/index.lasso
        /path/index.html
        /path/index.htm
        ... etc...

For as many index files as you've specified.

Anyhow, if you're interested, add the following debug commands just before your rewrite rules:

        RewriteEngine On

        # debug rewrite
        RewriteLog /path/to/rewrite.log
        RewriteLogLevel 9

        ### RULES HERE ###


That'll output what Apache is doing with your rewrite commands.  It's incredibly helpful for debugging why a rule is not working properly.  Do NOT do this on a production machine though!  RewriteLogLevel 9 is the most detailed level; hitting a single Lasso page will output a sizable chunk of logging.  In a production environment, the log would quickly grow to consume your entire drive.

So on your dev box, hit a single page and look at the results both with and without the sub-request filter; you'll see what I mean.

And much thanks to James Harvard who pointed out that there's also the [NS] flag, so instead of this:

    RewriteCond %{IS_SUBREQ}  false

You can just add the [NS] flag to the RewriteRule.  So my rules rewritten are:


<VirtualHost *:80>

    ### your vhost stuff here ###

    RewriteEngine On

    # Lasso gets all virtual files and directories
    # Apache 2.0 (and older?)
    RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d [OR]
    RewriteCond %{REQUEST_URI}  ^/$
    RewriteRule  (.*)  /processvirtualurl  [L,NS,E=XREQUEST:$1]
    RequestHeader set X-Request %{XREQUEST}e env=XREQUEST

    <Files processvirtualurl>
        SetHandler lasso8-handler
    </Files>

</VirtualHost>
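
To make the condition chain easier to reason about, here is a toy model of it in Python. It is only a sketch of how I read the rules above, not what Apache actually executes: it assumes the conditions combine as "A and B and (C or D)" (the [OR] flag joins a condition only with the one that follows it), and the function name, its parameters and the sample paths are all hypothetical.

```python
import re

def goes_to_lasso(uri: str, is_file: bool, is_dir: bool, is_subrequest: bool) -> bool:
    """Return True if the request would be rewritten to /processvirtualurl.

    is_file / is_dir stand in for the -f / -d filesystem tests.
    """
    # [NS] flag: the rule is skipped for internal sub-requests.
    if is_subrequest:
        return False
    # RewriteCond %{REQUEST_URI} !(?i)^.*\.LassoApp$
    if re.search(r'(?i)^.*\.LassoApp$', uri):
        return False
    # RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
    if is_file:
        return False
    # RewriteCond ... !-d [OR]
    # RewriteCond %{REQUEST_URI} ^/$
    return (not is_dir) or uri == '/'

# Hypothetical requests, for illustration only.
print(goes_to_lasso('/Recipes/Detail/6128/B.L.A._Bites/', False, False, False))
print(goes_to_lasso('/images/logo.gif', True, False, False))
print(goes_to_lasso('/', False, True, False))
print(goes_to_lasso('/css/', False, True, False))
```

In this model a virtual path with periods and the site root both go to Lasso, while real files, real directories other than the root, LassoApps and sub-requests are left alone.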



- Bil



Re: URL Design and periods

Göran Törnquist-2
Bil Corry wrote:

> And much thanks to James Harvard who pointed out that there's also the [NS] flag, so instead of this:
>
>     RewriteCond %{IS_SUBREQ}  false
>
> You can just add the [NS] flag to the RewriteRule.  So my rules rewritten are:
>
>
> <VirtualHost *:80>
>
>     ### your vhost stuff here ###
>
>     RewriteEngine On
>
>     # Lasso gets all virtual files and directories
>     # Apache 2.0 (and older?)
>     RewriteCond %{REQUEST_URI}  !(?i)^.*\.LassoApp$
>     RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
>     RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d [OR]
>     RewriteCond %{REQUEST_URI}  ^/$
>     RewriteRule  (.*)  /processvirtualurl  [L,NS,E=XREQUEST:$1]
>     RequestHeader set X-Request %{XREQUEST}e env=XREQUEST
>
>     <Files processvirtualurl>
>         SetHandler lasso8-handler
>     </Files>
>
> </VirtualHost>
>  
Thanks Bil and James for making Apache URI fiddling a little bit easier
to understand.

It almost works on older versions of Apache.

Two main things are changed in this version:
The syntax for checking for LassoApp without case sensitivity is
slightly different.
There is no directive for setting request headers. Therefore I've
changed the script to instead rewrite the query string to include the URI.

I've tested this script on Apache 1.3:

<VirtualHost *:80>

    ### your vhost stuff here ###

    RewriteEngine On

    # Lasso gets all virtual files and directories
    # Tested on Apache 1.3
    RewriteCond %{REQUEST_URI}  !^.*\.LassoApp$ [NC]
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d [OR]
    RewriteCond %{REQUEST_URI}  ^/$
    RewriteRule  (.*)  /processvirtualurl?vpath=$1  [L,NS,QSA]

    <Files processvirtualurl>
        SetHandler lasso8-handler
    </Files>

</VirtualHost>
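
As a rough illustration of what the handler ends up receiving with this variant, here is a small Python sketch. It only mimics the string handling: the new query string `vpath=$1` comes first and [QSA] appends the original query string after it. The function names and the sample path are hypothetical, and the sketch does no escaping, so how mod_rewrite itself treats special characters in $1 is not modelled.

```python
from urllib.parse import urlsplit, parse_qs

def rewrite_with_vpath(path: str, query: str = '') -> str:
    """Mimic: RewriteRule (.*) /processvirtualurl?vpath=$1 [QSA]"""
    target = '/processvirtualurl?vpath=' + path
    # QSA: append the original query string after the new one.
    if query:
        target += '&' + query
    return target

def read_vpath(target: str) -> str:
    """Recover the original path on the receiving side."""
    return parse_qs(urlsplit(target).query)['vpath'][0]

rewritten = rewrite_with_vpath('/Recipes/Detail/6128/B.L.A._Bites/', 'page=2')
print(rewritten)
print(read_vpath(rewritten))
```

A path containing "&" or "+" would need extra care in a real handler, since those characters have their own meaning in a query string.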

/Göran


Re: URL Design and periods

Johan Solve
On Mon, Oct 13, 2008 at 1:18 AM, Göran Törnquist <[hidden email]> wrote:
> There are no directive for setting request headers. Therefore I've changed
> the script to instead rewrite the querystring to include the URI.

Can't your atbegin handler just parse the response_filepath, instead
of sending the path as a vpath parameter?


--
Mvh
Johan Sölve
____________________________________
Montania System AB
Halmstad   Stockholm   Malmö
http://www.montania.se

Johan Sölve
Mobil +46 709-51 55 70
[hidden email]

Kristinebergsvägen 17, S-302 41 Halmstad, Sweden
Telefon +46 35-136800 |  Fax +46 35-136801
