[PATCH 1/1] global: provide memrchr implementation for macOS

Markus Mayer code at mmayer.net
Sun Jan 29 21:43:29 UTC 2023


On Fri, 27 Jan 2023 at 17:16, Glenn Strauss <gstrauss at gluelogic.com> wrote:
>
> Fun!  A small exercise for comparison if you like.
> Cheers, Glenn
>
> void *
> memrchr(const void *s, int c, size_t n)
> {
>     const unsigned char *cp = (const unsigned char *)s + n;
>     const unsigned char ch = (unsigned char)c;
>     while (s != cp) {
>         if (*(--cp) == ch)
>             return (void *)cp;
>     }
>     return NULL;
> }

Hi all,

As promised, I played around a bit. I ran a few experiments with
different memrchr() implementations. Everything I did can be found
here:

https://github.com/mmayer/cgit/tree/memrchr-compare

The test-specific code is in the memrchr_test folder[1] within that repo.

The four implementations I tried are:

memrchr: the original implementation (from Apple's sudo command) that
I submitted as v1
memrchr2: Alejandro's suggestion
memrchr3: Glenn's suggestion
memrchr4: for added fun, musl-libc's implementation[2]

I also checked the object and assembly files into the repo, so it's
easier to look at them if anybody wants to. They live in the
memrchr_test/output folder.

Here are the results for ARM and x86, both in assembly/object size and runtime.

ARM

# Object size of memrchr and memrchr2 is the same
-rw-r--r--  1 mmayer  staff  552 29 Jan 09:52 memrchr.o
-rw-r--r--  1 mmayer  staff  552 29 Jan 09:52 memrchr2.o
-rw-r--r--  1 mmayer  staff  544 29 Jan 09:52 memrchr3.o
-rw-r--r--  1 mmayer  staff  544 29 Jan 09:52 memrchr4.o

# Assembly source of memrchr2 is larger than memrchr
-rw-r--r--  1 mmayer  staff  694 29 Jan 09:52 memrchr2.s
-rw-r--r--  1 mmayer  staff  691 29 Jan 09:52 memrchr.s
-rw-r--r--  1 mmayer  staff  655 29 Jan 09:52 memrchr3.s
-rw-r--r--  1 mmayer  staff  655 29 Jan 09:52 memrchr4.s

# Execution times, in the order memrchr, memrchr2, memrchr3, memrchr4
execution time: 18.61453 seconds
execution time: 15.39163 seconds
execution time: 13.56957 seconds
execution time: 13.55493 seconds

x86

-rw-r--r--  1 mmayer  staff  656 29 Jan 10:02 memrchr.o
-rw-r--r--  1 mmayer  staff  656 29 Jan 10:02 memrchr2.o
-rw-r--r--  1 mmayer  staff  656 29 Jan 10:02 memrchr3.o
-rw-r--r--  1 mmayer  staff  648 29 Jan 10:02 memrchr4.o

-rw-r--r--  1 mmayer  staff  835 29 Jan 10:02 memrchr.s
-rw-r--r--  1 mmayer  staff  835 29 Jan 10:02 memrchr2.s
-rw-r--r--  1 mmayer  staff  825 29 Jan 10:02 memrchr3.s
-rw-r--r--  1 mmayer  staff  818 29 Jan 10:02 memrchr4.s

# Execution times, in the order memrchr, memrchr2, memrchr3, memrchr4
execution time: 20.29937 seconds
execution time: 23.67755 seconds
execution time: 12.59514 seconds
execution time: 11.38668 seconds
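
For readers who want to reproduce this kind of measurement, a timing
loop along the following lines would do. This is only a sketch and not
the harness from the memrchr_test folder: the names memrchr_fn,
memrchr_ptr and time_memrchr are made up for illustration, and
memrchr_ptr is a pointer-walking stand-in modelled on the version
quoted at the top of this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Signature shared by every memrchr candidate (hypothetical typedef). */
typedef void *(*memrchr_fn)(const void *s, int c, size_t n);

/* Stand-in candidate: walk a pointer backwards from the end of the buffer. */
static void *memrchr_ptr(const void *s, int c, size_t n)
{
    const unsigned char *start = (const unsigned char *)s;
    const unsigned char *cp = start + n;
    const unsigned char ch = (unsigned char)c;

    while (cp != start) {
        if (*(--cp) == ch)
            return (void *)cp;
    }
    return NULL;
}

/*
 * Call fn on the same buffer `iterations` times and return the elapsed
 * CPU time in seconds. The volatile sink keeps the compiler from
 * discarding the calls as dead code.
 */
static double time_memrchr(memrchr_fn fn, const void *buf, int c,
                           size_t n, unsigned long iterations)
{
    void *volatile sink = NULL;
    clock_t begin, end;
    unsigned long i;

    assert(fn != NULL);

    begin = clock();
    for (i = 0; i < iterations; i++)
        sink = fn(buf, c, n);
    end = clock();

    (void)sink;
    return (double)(end - begin) / CLOCKS_PER_SEC;
}
```

Placing the matching byte at the very start of a large buffer forces
each call to scan the whole buffer, which is the worst case for a
backward search. clock() measures CPU time at coarse resolution, so the
iteration count has to be large enough for the total to run into
seconds.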

As you can see, musl-libc's implementation is the fastest on both ARM
and x86. It is also the smallest: on x86 it is smaller than the other
three, and on ARM it ties with memrchr3. So, I guess it makes the most
sense to pick that (memrchr4.c in my experiments). The code is under
an MIT license, which I assume is fine for CGIT.
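
To give an idea of the style in question, here is a minimal sketch of
a backward search that counts the length down and uses it as an index.
It is written in the spirit of that approach and is not a verbatim
copy of musl's file; see [2] for the actual source. The name
memrchr_countdown is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer to the last occurrence of byte c in the first n
 * bytes of m, or NULL if it does not occur. The remaining length n
 * doubles as the index of the next byte to examine.
 */
static void *memrchr_countdown(const void *m, int c, size_t n)
{
    const unsigned char *s = (const unsigned char *)m;
    const unsigned char ch = (unsigned char)c;

    assert(m != NULL || n == 0);

    while (n--) {
        if (s[n] == ch)
            return (void *)(s + n);
    }
    return NULL;
}
```

Compared to the pointer-walking variants, this needs no separate end
pointer: the loop condition and the index share one variable, which
may be part of why the generated code comes out slightly smaller.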

What does everybody think?

Regards,
-Markus

[1] https://github.com/mmayer/cgit/tree/memrchr-compare/memrchr_test
[2] https://git.musl-libc.org/cgit/musl/tree/src/string/memrchr.c

