[PATCH 1/1] global: provide memrchr implementation for macOS
Markus Mayer
code at mmayer.net
Sun Jan 29 21:43:29 UTC 2023
On Fri, 27 Jan 2023 at 17:16, Glenn Strauss <gstrauss at gluelogic.com> wrote:
>
> Fun! A small exercise for comparison if you like.
> Cheers, Glenn
>
> void *
> memrchr(const void *s, int c, size_t n)
> {
>     const unsigned char *cp = (const unsigned char *)s + n;
>     const unsigned char ch = (unsigned char)c;
>     while (s != cp) {
>         if (*(--cp) == ch)
>             return (void *)cp;
>     }
>     return NULL;
> }
Hi all,
As promised, I played around a bit. I ran a few experiments with
different memrchr() implementations. Everything I did can be found
here:
https://github.com/mmayer/cgit/tree/memrchr-compare
The test-specific code is in the memrchr_test folder[1] within that repo.
The four implementations I tried are:
memrchr: the original implementation (from Apple's sudo command) that
I submitted as v1
memrchr2: Alejandro's suggestion
memrchr3: Glenn's suggestion
memrchr4: for added fun, musl-libc's implementation[2]
I also checked the object and assembly files into the repo, so it's
easier to look at them if anybody wants to. They live in the
memrchr_test/output folder.
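For readers who want to reproduce this kind of comparison without cloning the repo, here is a minimal sketch of a timing helper. It is not the code from the memrchr_test folder: memrchr_fn, memrchr_simple and time_memrchr are hypothetical names chosen for illustration, and clock() is only assumed to be precise enough for long runs.

```c
#include <assert.h>   /* only needed for the usage shown below */
#include <stddef.h>
#include <time.h>

/* Signature shared by every candidate implementation. */
typedef void *(*memrchr_fn)(const void *s, int c, size_t n);

/* Byte-wise backward scan, in the style of the quoted suggestion. */
static void *memrchr_simple(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s + n;
    const unsigned char ch = (unsigned char)c;

    while (p != (const unsigned char *)s) {
        if (*--p == ch)
            return (void *)p;
    }
    return NULL;
}

/*
 * Hypothetical helper: call fn on buf `iterations` times and return the
 * elapsed CPU time in seconds. The result of the last call is stored in
 * *last; the volatile sink keeps the compiler from dropping the calls.
 */
static double time_memrchr(memrchr_fn fn, const void *buf, int c, size_t n,
                           unsigned long iterations, void **last)
{
    void *volatile sink = NULL;
    clock_t start = clock();

    for (unsigned long i = 0; i < iterations; i++)
        sink = fn(buf, c, n);

    clock_t end = clock();
    *last = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A call such as time_memrchr(memrchr_simple, buf, '/', len, 1000000UL, &hit) returns the elapsed seconds for one candidate. Running each candidate over the same buffer and iteration count keeps the numbers comparable.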
Here are the results for ARM and x86, both in assembly/object size and runtime.
ARM
# Object size of memrchr and memrchr2 is the same
-rw-r--r-- 1 mmayer staff 552 29 Jan 09:52 memrchr.o
-rw-r--r-- 1 mmayer staff 552 29 Jan 09:52 memrchr2.o
-rw-r--r-- 1 mmayer staff 544 29 Jan 09:52 memrchr3.o
-rw-r--r-- 1 mmayer staff 544 29 Jan 09:52 memrchr4.o
# Assembly source of memrchr2 is larger than memrchr
-rw-r--r-- 1 mmayer staff 694 29 Jan 09:52 memrchr2.s
-rw-r--r-- 1 mmayer staff 691 29 Jan 09:52 memrchr.s
-rw-r--r-- 1 mmayer staff 655 29 Jan 09:52 memrchr3.s
-rw-r--r-- 1 mmayer staff 655 29 Jan 09:52 memrchr4.s
execution time: 18.61453 seconds
execution time: 15.39163 seconds
execution time: 13.56957 seconds
execution time: 13.55493 seconds
x86
-rw-r--r-- 1 mmayer staff 656 29 Jan 10:02 memrchr.o
-rw-r--r-- 1 mmayer staff 656 29 Jan 10:02 memrchr2.o
-rw-r--r-- 1 mmayer staff 656 29 Jan 10:02 memrchr3.o
-rw-r--r-- 1 mmayer staff 648 29 Jan 10:02 memrchr4.o
-rw-r--r-- 1 mmayer staff 835 29 Jan 10:02 memrchr.s
-rw-r--r-- 1 mmayer staff 835 29 Jan 10:02 memrchr2.s
-rw-r--r-- 1 mmayer staff 825 29 Jan 10:02 memrchr3.s
-rw-r--r-- 1 mmayer staff 818 29 Jan 10:02 memrchr4.s
execution time: 20.29937 seconds
execution time: 23.67755 seconds
execution time: 12.59514 seconds
execution time: 11.38668 seconds
As you can see, musl-libc provides the fastest implementation on both ARM
and x86, although on ARM it is only marginally ahead of memrchr3. It is
also the smallest on x86 and ties with memrchr3 for the smallest on ARM.
So, I guess it makes the most sense to pick that (memrchr4.c in my
experiments). The code is under an MIT license, which I assume is fine
for CGIT.
What does everybody think?
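Before swapping one implementation for another, it may be worth checking that the candidates return identical results. The sketch below compares a pointer-based and an index-based backward scan over every prefix length and byte value. The names scan_ptr, scan_idx and scans_agree are hypothetical, and neither function is copied from any of the four implementations above.

```c
#include <assert.h>   /* only needed for the usage shown below */
#include <stddef.h>

/* Pointer-based backward scan: walk a pointer from the end to the start. */
static void *scan_ptr(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s + n;
    const unsigned char ch = (unsigned char)c;

    while (p != (const unsigned char *)s) {
        if (*--p == ch)
            return (void *)p;
    }
    return NULL;
}

/* Index-based backward scan: count n down and index from the start. */
static void *scan_idx(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    const unsigned char ch = (unsigned char)c;

    while (n--) {
        if (p[n] == ch)
            return (void *)(p + n);
    }
    return NULL;
}

/*
 * Return 1 if both scans agree for every byte value 0..255 and every
 * prefix length 0..len of buf, otherwise 0.
 */
static int scans_agree(const void *buf, size_t len)
{
    for (size_t n = 0; n <= len; n++) {
        for (int c = 0; c < 256; c++) {
            if (scan_ptr(buf, c, n) != scan_idx(buf, c, n))
                return 0;
        }
    }
    return 1;
}
```

For example, assert(scans_agree(data, sizeof(data))) on a small test buffer exercises the empty-length case and the not-found case as well as ordinary matches.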
Regards,
-Markus
[1] https://github.com/mmayer/cgit/tree/memrchr-compare/memrchr_test
[2] https://git.musl-libc.org/cgit/musl/tree/src/string/memrchr.c