Interesting speed comparison between large map find and match

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Interesting speed comparison between large map find and match

Jolle Carlestam-2
Here’s something I’ve learned today.

In order to pinpoint what page the user is requesting I have a map built with all legitimate page paths as key and the type that’s associated with that call as value.

It’s been fast and sufficient for my needs. But as times pass and clients request more and more functionality, the map grows.

Today I was pondering if this approach actually is the fastest for fairly large maps. To find out I’ve setup a test to compare it with a match / case construction.

Here’s the code I’ve used for testing:

local(

        timervalue = 1000
)

timer(#timervalue) => {^

        local(
                bigmap = map(
                        '1_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '2_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '3_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '4_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '5_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '6_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '7_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '8_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '9_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
                        '10_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',
// etc up to 500 items
                        '500_Lorem' = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.',


                ),
                output = ''

        )

        loop(200,100) => {
                #output -> append(loop_count + #bigmap -> find(loop_count + '_Lorem'))
        }

        #output
^}

timer(#timervalue) => {^

        local(
                output = ''
        )

        loop(200,100) => {
                match(loop_count + '_Lorem') => {
                        case('1_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('2_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('3_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('4_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('5_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('6_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('7_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('8_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('9_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')
                        case('10_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')

// etc up to 500 items
                        case(’500_Lorem') #output -> append(loop_count + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim iaculis libero consequat sed.')

                }
        }

        #output


^}

My findings indicate that match / case is significantly faster than find in a map.

Find in map:
array((micros = 12935079.000000), (micros_average = 12935.079000))
Match / case
array((micros = 5641225.000000), (micros_average = 5641.225000))


My conclusion is that if the dataset you’re looking into is fixed (as in almost never changes and can be defined in code) it is better from a speed perspective to do match / find.

But, if the dataset is dynamic then it’s probably more convenient to turn it into a map and use that.

HDB
Jolle

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Bil Corry-3
Interesting.  When I created lp_client_browser, I needed a way to have a
large static list of possible values, so I settled on a lookup based on the
first character of the user-agent string, then did the lookup on only those
values.  Probably not the most efficient way, but at the time it seemed
good enough, e.g.:

'q' = (array:
'Qango' = (:'Qango.com Web Directory robot','R'),
'QPCreep' = (:'Quepasa!com (Latin American search) robot','R'),
'QuepasaCreep' = (:'Quepasa!com (Latin American search) robot','R'),
'QueryN' = (:'QueryN Metasearch robot','R'),
'QuickTime' = (:'Quicktime for Macintosh','B'),
'Qweery' = (:'Qweerybot for the Qweery search engine (in development) -
Netherland','R'),
),
'r' = (array:
'rabaz' = (:'gigaBaz - the brainbot (Germany) robot','R'),
'RaBot' = (:'Daum Korea robot (211.115.109.xxx)','R'),
'ramBot' = (:'Intersearch.de (was www.intersearch.de) robot (Germany)','R'),
'RAMPyBot' = (:'RAMPyBot','R'),
'Rank Exec' = (:'Rank Exec reciprocal link checking','C'),
'Rational SiteCheck' = (:'Innova/IBM Rational SiteCheck - Rational
robot','R'),
'ReadABlog' = (:'Read A Blog - RSS feed and blog search engine','C'),
'RealDownload' = (:'RealDownload download manager','D'),
'Reaper' = (:'Reaper robot for SiteSearch','R'),


- Bil

2015-10-15 11:47 GMT+02:00 Jolle Carlestam <[hidden email]>:

> Here’s something I’ve learned today.
>
> In order to pinpoint what page the user is requesting I have a map built
> with all legitimate page paths as key and the type that’s associated with
> that call as value.
>
> It’s been fast and sufficient for my needs. But as times pass and clients
> request more and more functionality, the map grows.
>
> Today I was pondering if this approach actually is the fastest for fairly
> large maps. To find out I’ve setup a test to compare it with a match / case
> construction.
>
> Here’s the code I’ve used for testing:
>
> local(
>
>         timervalue = 1000
> )
>
> timer(#timervalue) => {^
>
>         local(
>                 bigmap = map(
>                         '1_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '2_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '3_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '4_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '5_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '6_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '7_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '8_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '9_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>                         '10_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
> // etc up to 500 items
>                         '500_Lorem' = 'Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Mauris consequat ornare lectus, dignissim
> iaculis libero consequat sed.',
>
>
>                 ),
>                 output = ''
>
>         )
>
>         loop(200,100) => {
>                 #output -> append(loop_count + #bigmap -> find(loop_count
> + '_Lorem'))
>         }
>
>         #output
> ^}
>
> timer(#timervalue) => {^
>
>         local(
>                 output = ''
>         )
>
>         loop(200,100) => {
>                 match(loop_count + '_Lorem') => {
>                         case('1_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('2_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('3_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('4_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('5_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('6_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('7_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('8_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('9_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>                         case('10_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>
> // etc up to 500 items
>                         case(’500_Lorem') #output -> append(loop_count +
> 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris consequat
> ornare lectus, dignissim iaculis libero consequat sed.')
>
>                 }
>         }
>
>         #output
>
>
> ^}
>
> My findings indicate that match / case is significantly faster than find
> in a map.
>
> Find in map:
> array((micros = 12935079.000000), (micros_average = 12935.079000))
> Match / case
> array((micros = 5641225.000000), (micros_average = 5641.225000))
>
>
> My conclusion is that if the dataset you’re looking into is fixed (as in
> almost never changes and can be defined in code) it is better from a speed
> perspective to do match / find.
>
> But, if the dataset is dynamic then it’s probably more convenient to turn
> it into a map and use that.
>
> HDB
> Jolle
>
> #############################################################
>
> This message is sent to you because you are subscribed to
>   the mailing list Lasso [hidden email]
> Official list archives available at http://www.lassotalk.com
> To unsubscribe, E-mail to: <[hidden email]>
> Send administrative queries to  <[hidden email]>

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Ke Carlton-3
In reply to this post by Jolle Carlestam-2
Hello Jolle,

Interesting testing. I believe some parts of your tests could be improved —
but still yields interesting results. With match, evaluating 1_Lorem will
always be substantially faster than 500_Lorem because it doesn't have to
evaluate 499 other things.

A better test would be random keys, searched in a (pre) randomised order,
here are my results based on 100 random keys:

   Random access 100 keys x10
   array((micros = 4593.000000), (micros_average = 459.300000)) - map
   array((micros = 3266.000000), (micros_average = 326.600000)) - hashtable
   array((micros = 17445.000000), (micros_average = 1744.500000)) - match

That said; if you know what the most frequently access values are you do
have the opportunity to optimise match + case by moving them to the top of
the list:

   Find "First item" x1000
   array((micros = 4785.000000), (micros_average = 4.785000)) - map
   array((micros = 3977.000000), (micros_average = 3.977000)) - hashtable
   array((micros = 1912.000000), (micros_average = 1.912000)) - match

However that means you're also stung when addressing the last item:

   Find "Last item" x1000
   array((micros = 5193.000000), (micros_average = 5.193000)) - map
   array((micros = 4066.000000), (micros_average = 4.066000)) - hashtable
   array((micros = 30506.000000), (micros_average = 30.506000)) - match

Same goes for a missing item:

   Find Missing item x1000
   array((micros = 4339.000000), (micros_average = 4.339000)) - map
   array((micros = 2113.000000), (micros_average = 2.113000)) - hashtable
   array((micros = 58536.000000), (micros_average = 58.536000)) - match

On the flip side, there is almost 0 setup time for the case construction (I
assume the work happens in sourcefile):

   Setup
   array((micros = 902.000000), (micros_average = 902.000000)) - created map
   array((micros = 641.000000), (micros_average = 641.000000)) - created
hastable
   array((micros = 1.000000), (micros_average = 1.000000)) - defined case

So there's definitely a situation where this would be overall quicker — but
I don't think finding that sweet spot is so straight forward. Definitely
warrants more attention.

Here's the gist: https://gist.github.com/Ke-/a453694ce1293800e6f4

Ke

On Thu, Oct 15, 2015 at 10:47 PM Jolle Carlestam <[hidden email]>
wrote:

> Here’s something I’ve learned today.
>
> In order to pinpoint what page the user is requesting I have a map built
> with all legitimate page paths as key and the type that’s associated with
> that call as value.
>
> It’s been fast and sufficient for my needs. But as times pass and clients
> request more and more functionality, the map grows.
>


> My findings indicate that match / case is significantly faster than find
> in a map.
>
> Find in map:
> array((micros = 12935079.000000), (micros_average = 12935.079000))
> Match / case
> array((micros = 5641225.000000), (micros_average = 5641.225000))
>
> My conclusion is that if the dataset you’re looking into is fixed (as in
> almost never changes and can be defined in code) it is better from a speed
> perspective to do match / find.
>
> But, if the dataset is dynamic then it’s probably more convenient to turn
> it into a map and use that.
>
> HDB
> Jolle

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Jolle Carlestam-2
In reply to this post by Bil Corry-3
15 okt 2015 kl. 13:36 skrev Bil Corry <[hidden email]>:
>
> Interesting
>
> 15 okt 2015 kl. 13:38 skrev Ke Carlton <[hidden email]>:
>
> Interesting

Indeed it is. Thanks both Bil and Ke for valuable input.

In regards to optimizing the match version it is indeed in my case specific easy to do. Since I have stats showing what pages are most frequently asked for.
Putting the high traffic requests at the top and the ones that are visited about once a month at the bottom is done in a jiffy.

HDB
Jolle

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Eric Knibbe-2
Have you tried substituting hash_map for map?
‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
Eric3
 

On 2015-10-15, at 8:08 AM, Jolle Carlestam wrote:

> 15 okt 2015 kl. 13:36 skrev Bil Corry <[hidden email]>:
>>
>> Interesting
>>
>> 15 okt 2015 kl. 13:38 skrev Ke Carlton <[hidden email]>:
>>
>> Interesting
>
> Indeed it is. Thanks both Bil and Ke for valuable input.
>
> In regards to optimizing the match version it is indeed in my case specific easy to do. Since I have stats showing what pages are most frequently asked for.
> Putting the high traffic requests at the top and the ones that are visited about once a month at the bottom is done in a jiffy.
>
> HDB
> Jolle
>
> #############################################################
>
> This message is sent to you because you are subscribed to
>  the mailing list Lasso [hidden email]
> Official list archives available at http://www.lassotalk.com
> To unsubscribe, E-mail to: <[hidden email]>
> Send administrative queries to  <[hidden email]>


#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Ke Carlton-3
I just tried with hash_map and it fails to work when inserting the
elements: Position was out of range: -141 max is 193

Ke



On Sat, Oct 24, 2015 at 6:41 AM Eric Knibbe <[hidden email]> wrote:

> Have you tried substituting hash_map for map?
> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
> Eric3
>
>
> On 2015-10-15, at 8:08 AM, Jolle Carlestam wrote:
>
> > 15 okt 2015 kl. 13:36 skrev Bil Corry <[hidden email]>:
> >>
> >> Interesting
> >>
> >> 15 okt 2015 kl. 13:38 skrev Ke Carlton <[hidden email]>:
> >>
> >> Interesting
> >
> > Indeed it is. Thanks both Bil and Ke for valuable input.
> >
> > In regards to optimizing the match version it is indeed in my case
> specific easy to do. Since I have stats showing what pages are most
> frequently asked for.
> > Putting the high traffic requests at the top and the ones that are
> visited about once a month at the bottom is done in a jiffy.
> >
> > HDB
> > Jolle
> >
> > #############################################################
> >
> > This message is sent to you because you are subscribed to
> >  the mailing list Lasso [hidden email]
> > Official list archives available at http://www.lassotalk.com
> > To unsubscribe, E-mail to: <[hidden email]>
> > Send administrative queries to  <[hidden email]>
>
>
> #############################################################
>
> This message is sent to you because you are subscribed to
>   the mailing list Lasso [hidden email]
> Official list archives available at http://www.lassotalk.com
> To unsubscribe, E-mail to: <[hidden email]>
> Send administrative queries to  <[hidden email]>

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Eric Knibbe-2
Turns out that for some cases, string->hash will return a negative value. Adding math_abs to private bucketNumber(k::trait_hashable, sz::integer)::integer makes it work, but now it seems that hash_map is noticeably slower than a plain map. I wonder if implementing a bit-shifting absolute value method that didn't use conditionals would help?

<http://source.lassosoft.com/svn/lasso/lasso9_source/trunk/hash_map.lasso>
‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
Eric3
 

On 2015-10-23, at 6:02 PM, Ke Carlton wrote:

> I just tried with hash_map and it fails to work when inserting the
> elements: Position was out of range: -141 max is 193
>
> Ke
>
>
>
> On Sat, Oct 24, 2015 at 6:41 AM Eric Knibbe <[hidden email]> wrote:
>
>> Have you tried substituting hash_map for map?
>> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
>> Eric3
>>
>>
>> On 2015-10-15, at 8:08 AM, Jolle Carlestam wrote:
>>
>>> 15 okt 2015 kl. 13:36 skrev Bil Corry <[hidden email]>:
>>>>
>>>> Interesting
>>>>
>>>> 15 okt 2015 kl. 13:38 skrev Ke Carlton <[hidden email]>:
>>>>
>>>> Interesting
>>>
>>> Indeed it is. Thanks both Bil and Ke for valuable input.
>>>
>>> In regards to optimizing the match version it is indeed in my case
>> specific easy to do. Since I have stats showing what pages are most
>> frequently asked for.
>>> Putting the high traffic requests at the top and the ones that are
>> visited about once a month at the bottom is done in a jiffy.
>>>
>>> HDB
>>> Jolle
>>>
>>> #############################################################
>>>
>>> This message is sent to you because you are subscribed to
>>> the mailing list Lasso [hidden email]
>>> Official list archives available at http://www.lassotalk.com
>>> To unsubscribe, E-mail to: <[hidden email]>
>>> Send administrative queries to  <[hidden email]>
>>
>>
>> #############################################################
>>
>> This message is sent to you because you are subscribed to
>>  the mailing list Lasso [hidden email]
>> Official list archives available at http://www.lassotalk.com
>> To unsubscribe, E-mail to: <[hidden email]>
>> Send administrative queries to  <[hidden email]>
>
> #############################################################
>
> This message is sent to you because you are subscribed to
>  the mailing list Lasso [hidden email]
> Official list archives available at http://www.lassotalk.com
> To unsubscribe, E-mail to: <[hidden email]>
> Send administrative queries to  <[hidden email]>


#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Ke Carlton-3
Why? — hash_table blitzes both:
https://github.com/zeroloop/ds/blob/master/tables.lasso



On Tue, Oct 27, 2015 at 11:33 AM Eric Knibbe <[hidden email]> wrote:

> Turns out that for some cases, string->hash will return a negative value.
> Adding math_abs to private bucketNumber(k::trait_hashable,
> sz::integer)::integer makes it work, but now it seems that hash_map is
> noticeably slower than a plain map. I wonder if implementing a bit-shifting
> absolute value method that didn't use conditionals would help?
>
> <http://source.lassosoft.com/svn/lasso/lasso9_source/trunk/hash_map.lasso>
> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
> Eric3
>
>
> On 2015-10-23, at 6:02 PM, Ke Carlton wrote:
>
> > I just tried with hash_map and it fails to work when inserting the
> > elements: Position was out of range: -141 max is 193
> >
> > Ke
> >
> >
> >
> > On Sat, Oct 24, 2015 at 6:41 AM Eric Knibbe <[hidden email]> wrote:
> >
> >> Have you tried substituting hash_map for map?
> >> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
> >> Eric3
> >>
> >>
> >> On 2015-10-15, at 8:08 AM, Jolle Carlestam wrote:
> >>
> >>> 15 okt 2015 kl. 13:36 skrev Bil Corry <[hidden email]>:
> >>>>
> >>>> Interesting
> >>>>
> >>>> 15 okt 2015 kl. 13:38 skrev Ke Carlton <[hidden email]>:
> >>>>
> >>>> Interesting
> >>>
> >>> Indeed it is. Thanks both Bil and Ke for valuable input.
> >>>
> >>> In regards to optimizing the match version it is indeed in my case
> >> specific easy to do. Since I have stats showing what pages are most
> >> frequently asked for.
> >>> Putting the high traffic requests at the top and the ones that are
> >> visited about once a month at the bottom is done in a jiffy.
> >>>
> >>> HDB
> >>> Jolle
> >>>
> >>> #############################################################
> >>>
> >>> This message is sent to you because you are subscribed to
> >>> the mailing list Lasso [hidden email]
> >>> Official list archives available at http://www.lassotalk.com
> >>> To unsubscribe, E-mail to: <[hidden email]>
> >>> Send administrative queries to  <[hidden email]>
> >>
> >>
> >> #############################################################
> >>
> >> This message is sent to you because you are subscribed to
> >>  the mailing list Lasso [hidden email]
> >> Official list archives available at http://www.lassotalk.com
> >> To unsubscribe, E-mail to: <[hidden email]>
> >> Send administrative queries to  <[hidden email]>
> >
> > #############################################################
> >
> > This message is sent to you because you are subscribed to
> >  the mailing list Lasso [hidden email]
> > Official list archives available at http://www.lassotalk.com
> > To unsubscribe, E-mail to: <[hidden email]>
> > Send administrative queries to  <[hidden email]>
>
>
> #############################################################
>
> This message is sent to you because you are subscribed to
>   the mailing list Lasso [hidden email]
> Official list archives available at http://www.lassotalk.com
> To unsubscribe, E-mail to: <[hidden email]>
> Send administrative queries to  <[hidden email]>

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Ke Carlton-3
I didn't get to finish my reply before that was sent...

Why? — hash_table blitzes both:
https://github.com/zeroloop/ds/blob/master/tables.lasso

When testing hash_map a few years ago I didn't see any major performance
gains when using it. I assume that was half the reason it was not featured
in the docs or similar.

Looking at the source of it, it seems overly complex. Here's the fix I used
to get it working:

private bucketNumber(k::trait_hashable, sz::integer)::integer =>
(#k->hash->abs % #sz)+1

Here are my results:

Random access
array((micros = 5147.000000), (micros_average = 514.700000)) - map
array((micros = 4565.000000), (micros_average = 456.500000)) - hashtable
array((micros = 16831.000000), (micros_average = 1683.100000)) - hash_map
array((micros = 15794.000000), (micros_average = 1579.400000)) - match

"First item" D90CC508-C8B9-46CA-A527-85DE0E4CF387
array((micros = 5073.000000), (micros_average = 5.073000)) - map
array((micros = 3860.000000), (micros_average = 3.860000)) - hashtable
array((micros = 12359.000000), (micros_average = 12.359000)) - hash_map
array((micros = 1721.000000), (micros_average = 1.721000)) - match

"Last item" 4E2B72D1-57E3-4E15-8650-FBB630F69510
array((micros = 5254.000000), (micros_average = 5.254000)) - map
array((micros = 3986.000000), (micros_average = 3.986000)) - hashtable
array((micros = 14238.000000), (micros_average = 14.238000)) - hash_map
array((micros = 30605.000000), (micros_average = 30.605000)) - match

Missing item
array((micros = 4893.000000), (micros_average = 4.893000)) - map
array((micros = 1995.000000), (micros_average = 1.995000)) - hashtable
array((micros = 3201.000000), (micros_average = 3.201000)) - hash_map
array((micros = 61519.000000), (micros_average = 61.519000)) - match


As you can see; hash_map is significantly slower than both map and
hashtable.

Ke


On Tue, Oct 27, 2015 at 11:52 AM Ke Carlton <[hidden email]> wrote:

> Why? — hash_table blitzes both:
> https://github.com/zeroloop/ds/blob/master/tables.lasso
>
>
>
> On Tue, Oct 27, 2015 at 11:33 AM Eric Knibbe <[hidden email]> wrote:
>
>> Turns out that for some cases, string->hash will return a negative value.
>> Adding math_abs to private bucketNumber(k::trait_hashable,
>> sz::integer)::integer makes it work, but now it seems that hash_map is
>> noticeably slower than a plain map. I wonder if implementing a bit-shifting
>> absolute value method that didn't use conditionals would help?
>>
>> <http://source.lassosoft.com/svn/lasso/lasso9_source/trunk/hash_map.lasso
>> >
>> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
>> Eric3
>>
>>
>> On 2015-10-23, at 6:02 PM, Ke Carlton wrote:
>>
>> > I just tried with hash_map and it fails to work when inserting the
>> > elements: Position was out of range: -141 max is 193
>> >
>> > Ke
>> >
>> >
>> >
>> > On Sat, Oct 24, 2015 at 6:41 AM Eric Knibbe <[hidden email]> wrote:
>> >
>> >> Have you tried substituting hash_map for map?
>> >> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
>> >> Eric3
>> >>
>> >>
>> >> On 2015-10-15, at 8:08 AM, Jolle Carlestam wrote:
>> >>
>> >>> 15 okt 2015 kl. 13:36 skrev Bil Corry <[hidden email]>:
>> >>>>
>> >>>> Interesting
>> >>>>
>> >>>> 15 okt 2015 kl. 13:38 skrev Ke Carlton <[hidden email]>:
>> >>>>
>> >>>> Interesting
>> >>>
>> >>> Indeed it is. Thanks both Bil and Ke for valuable input.
>> >>>
>> >>> In regards to optimizing the match version it is indeed in my case
>> >> specific easy to do. Since I have stats showing what pages are most
>> >> frequently asked for.
>> >>> Putting the high traffic requests at the top and the ones that are
>> >> visited about once a month at the bottom is done in a jiffy.
>> >>>
>> >>> HDB
>> >>> Jolle
>> >>>
>> >>> #############################################################
>> >>>
>> >>> This message is sent to you because you are subscribed to
>> >>> the mailing list Lasso [hidden email]
>> >>> Official list archives available at http://www.lassotalk.com
>> >>> To unsubscribe, E-mail to: <[hidden email]>
>> >>> Send administrative queries to  <[hidden email]>
>> >>
>> >>
>> >> #############################################################
>> >>
>> >> This message is sent to you because you are subscribed to
>> >>  the mailing list Lasso [hidden email]
>> >> Official list archives available at http://www.lassotalk.com
>> >> To unsubscribe, E-mail to: <[hidden email]>
>> >> Send administrative queries to  <[hidden email]>
>> >
>> > #############################################################
>> >
>> > This message is sent to you because you are subscribed to
>> >  the mailing list Lasso [hidden email]
>> > Official list archives available at http://www.lassotalk.com
>> > To unsubscribe, E-mail to: <[hidden email]>
>> > Send administrative queries to  <[hidden email]>
>>
>>
>> #############################################################
>>
>> This message is sent to you because you are subscribed to
>>   the mailing list Lasso [hidden email]
>> Official list archives available at http://www.lassotalk.com
>> To unsubscribe, E-mail to: <[hidden email]>
>> Send administrative queries to  <[hidden email]>
>
>

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: Interesting speed comparison between large map find and match

Ke Carlton-3
Oh, here are the results with indextable — another data store type that
comes with DS:

Random access array((micros = 5662.000000), (micros_average = 566.200000))
- map array((micros = 3116.000000), (micros_average = 311.600000)) -
hashtable array((micros = 2624.000000), (micros_average = 262.400000)) -
indextable * array((micros = 12459.000000), (micros_average = 1245.900000))
- hash_map array((micros = 16338.000000), (micros_average = 1633.800000)) -
match "First item" D90CC508-C8B9-46CA-A527-85DE0E4CF387 array((micros =
4905.000000), (micros_average = 4.905000)) - map array((micros =
3654.000000), (micros_average = 3.654000)) - hashtable array((micros =
2415.000000), (micros_average = 2.415000)) - indextable * array((micros =
5086.000000), (micros_average = 5.086000)) - staticarray array((micros =
1938.000000), (micros_average = 1.938000)) - match "Last item"
4E2B72D1-57E3-4E15-8650-FBB630F69510 array((micros = 5152.000000),
(micros_average = 5.152000)) - map array((micros = 3591.000000),
(micros_average = 3.591000)) - hashtable array((micros = 3385.000000),
(micros_average = 3.385000)) - indextable * array((micros = 13541.000000),
(micros_average = 13.541000)) - hash_map array((micros = 29157.000000),
(micros_average = 29.157000)) - match




On Tue, Oct 27, 2015 at 12:10 PM Ke Carlton <[hidden email]> wrote:

> I didn't get to finish my reply before that was sent...
>
> Why? — hash_table blitzes both:
> https://github.com/zeroloop/ds/blob/master/tables.lasso
>
> When testing hash_map a few years ago I didn't see any major performance
> gains when using it. I assume that was half the reason it was not featured
> in the docs or similar.
>
> Looking at the source of it, it seems overly complex. Here's the fix I
> used to get it working:
>
> private bucketNumber(k::trait_hashable, sz::integer)::integer =>
> (#k->hash->abs % #sz)+1
>
> Here are my results:
>
> Random access
> array((micros = 5147.000000), (micros_average = 514.700000)) - map
> array((micros = 4565.000000), (micros_average = 456.500000)) - hashtable
> array((micros = 16831.000000), (micros_average = 1683.100000)) - hash_map
> array((micros = 15794.000000), (micros_average = 1579.400000)) - match
>
> "First item" D90CC508-C8B9-46CA-A527-85DE0E4CF387
> array((micros = 5073.000000), (micros_average = 5.073000)) - map
> array((micros = 3860.000000), (micros_average = 3.860000)) - hashtable
> array((micros = 12359.000000), (micros_average = 12.359000)) - hash_map
> array((micros = 1721.000000), (micros_average = 1.721000)) - match
>
> "Last item" 4E2B72D1-57E3-4E15-8650-FBB630F69510
> array((micros = 5254.000000), (micros_average = 5.254000)) - map
> array((micros = 3986.000000), (micros_average = 3.986000)) - hashtable
> array((micros = 14238.000000), (micros_average = 14.238000)) - hash_map
> array((micros = 30605.000000), (micros_average = 30.605000)) - match
>
> Missing item
> array((micros = 4893.000000), (micros_average = 4.893000)) - map
> array((micros = 1995.000000), (micros_average = 1.995000)) - hashtable
> array((micros = 3201.000000), (micros_average = 3.201000)) - hash_map
> array((micros = 61519.000000), (micros_average = 61.519000)) - match
>
>
> As you can see; hash_map is significantly slower than both map and
> hashtable.
>
> Ke
>
>
> On Tue, Oct 27, 2015 at 11:52 AM Ke Carlton <[hidden email]> wrote:
>
>> Why? — hash_table blitzes both:
>> https://github.com/zeroloop/ds/blob/master/tables.lasso
>>
>>
>>
>> On Tue, Oct 27, 2015 at 11:33 AM Eric Knibbe <[hidden email]> wrote:
>>
>>> Turns out that for some cases, string->hash will return a negative
>>> value. Adding math_abs to private bucketNumber(k::trait_hashable,
>>> sz::integer)::integer makes it work, but now it seems that hash_map is
>>> noticeably slower than a plain map. I wonder if implementing a bit-shifting
>>> absolute value method that didn't use conditionals would help?
>>>
>>> <
>>> http://source.lassosoft.com/svn/lasso/lasso9_source/trunk/hash_map.lasso
>>> >
>>> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
>>> Eric3
>>>
>>>
>>> On 2015-10-23, at 6:02 PM, Ke Carlton wrote:
>>>
>>> > I just tried with hash_map and it fails to work when inserting the
>>> > elements: Position was out of range: -141 max is 193
>>> >
>>> > Ke
>>> >
>>> >
>>> >
>>> > On Sat, Oct 24, 2015 at 6:41 AM Eric Knibbe <[hidden email]>
>>> wrote:
>>> >
>>> >> Have you tried substituting hash_map for map?
>>> >> ‐‐‐‐‐‐‐‐‐‐✂‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
>>> >> Eric3
>>> >>
>>> >>
>>> >> On 2015-10-15, at 8:08 AM, Jolle Carlestam wrote:
>>> >>
>>> >>> 15 okt 2015 kl. 13:36 skrev Bil Corry <[hidden email]>:
>>> >>>>
>>> >>>> Interesting
>>> >>>>
>>> >>>> 15 okt 2015 kl. 13:38 skrev Ke Carlton <[hidden email]>:
>>> >>>>
>>> >>>> Interesting
>>> >>>
>>> >>> Indeed it is. Thanks both Bil and Ke for valuable input.
>>> >>>
>>> >>> In regards to optimizing the match version it is indeed in my case
>>> >> specific easy to do. Since I have stats showing what pages are most
>>> >> frequently asked for.
>>> >>> Putting the high traffic requests at the top and the ones that are
>>> >> visited about once a month at the bottom is done in a jiffy.
>>> >>>
>>> >>> HDB
>>> >>> Jolle
>>> >>>
>>> >>> #############################################################
>>> >>>
>>> >>> This message is sent to you because you are subscribed to
>>> >>> the mailing list Lasso [hidden email]
>>> >>> Official list archives available at http://www.lassotalk.com
>>> >>> To unsubscribe, E-mail to: <[hidden email]>
>>> >>> Send administrative queries to  <[hidden email]>
>>> >>
>>> >>
>>> >> #############################################################
>>> >>
>>> >> This message is sent to you because you are subscribed to
>>> >>  the mailing list Lasso [hidden email]
>>> >> Official list archives available at http://www.lassotalk.com
>>> >> To unsubscribe, E-mail to: <[hidden email]>
>>> >> Send administrative queries to  <[hidden email]>
>>> >
>>> > #############################################################
>>> >
>>> > This message is sent to you because you are subscribed to
>>> >  the mailing list Lasso [hidden email]
>>> > Official list archives available at http://www.lassotalk.com
>>> > To unsubscribe, E-mail to: <[hidden email]>
>>> > Send administrative queries to  <[hidden email]>
>>>
>>>
>>> #############################################################
>>>
>>> This message is sent to you because you are subscribed to
>>>   the mailing list Lasso [hidden email]
>>> Official list archives available at http://www.lassotalk.com
>>> To unsubscribe, E-mail to: <[hidden email]>
>>> Send administrative queries to  <[hidden email]>
>>
>>

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>