26 Apr 2024


The Servo Blog: This month in Servo: Acid2 redux, Servo book, Qt demo, and more!

[Screenshot: Servo nightly, now rendering Acid2 perfectly.] Servo now renders Acid2 perfectly, but like all browsers, only at 1x dpi.

Back in November, Servo's new layout engine passed Acid1, and this month, thanks to a bug-squashing sprint by @mrobinson and @Loirooriol, we now pass Acid2!

We would also like to thank you all for your generous support! Since we moved to Open Collective and GitHub Sponsors in March, we have received 1578 USD (after fees), including 1348 USD/month (before fees) in recurring donations. This smashed our first two goals, and is a respectable part of the way towards our next goal of 10000 USD/month. For more details, see our Sponsorship page and announcement post.

We are still receiving donations from 19 people on LFX, and we're working on transferring the balance to our new fund, but we will stop accepting donations there soon - please move your recurring donations to GitHub or Open Collective. As always, use of these funds will be decided transparently in the Technical Steering Committee, starting with the TSC meeting on 29 April.

The Servo book, a book much like the Rust book

Servo's docs are moving to the Servo book, and a very early version of this is now online (@delan, servo/book)! The goal is to unify our many sources of documentation, including the hacking quickstart guide, building Servo page, Servo design page, and other in-tree docs and wiki pages, into a book that's richer and easier to search and navigate.

Servo now supports several new features in its nightly builds:

As of 2024-04-05, we now support non-autoplay <video> (@eerii, media#419, #32001), as long as the page provides its own controls, as well as the 'baseline-source' property (@MunishMummadi, #31904, #31913). Both of these contributors started out as Outreachy participants, and we're thrilled to see their continued work on improving Servo.

We've also landed several other rendering improvements:

Our font rendering has improved, with support for selecting the correct weight and style in indexed fonts (.ttc) on Linux (@mukilan, @mrobinson, #32127), as well as support for emoji font fallback on macOS (@mrobinson, #32122). Note that color emoji are not yet supported.

Other big changes are coming to Servo's font loading and rendering, thanks to @mrobinson's font system redesign RFC (#32033). Work has already started on this (@mrobinson, @mukilan, #32034, #32038, #32100, #32101, #32115), with the eventual goal of making font data zero-copy readable from multiple threads. This in turn will fix several major issues with font caching, including cached font data leaking over time and between pages, unnecessary loading from disk, and unnecessary copying to layout.

We've also started simplifying the script-layout interface (@mrobinson, #31937, #32081), since layout was merged into the script thread, and script can now call into layout without IPC.

Embedding and dev changes

[Screenshot: Servo running in a Qt app via CXX-Qt.] The prototype shows that Servo can be integrated with a Qt app via CXX-Qt.

A prototype for integrating Servo with Qt was built by @ahayzen-kdab and @vimpostor and shown at Embedded World 2024. We're looking forward to incorporating their feedback from this to improve Servo's embedding API. For more details, check out their GitHub repo and Embedding the Servo Web Engine in Qt.

Servo now supports multiple concurrent webviews (@wusyong, @delan, @atbrakhi, #31417, #32067)! This is a big step towards making Servo a viable embedded webview, and we will soon use it to implement tabbed browsing in servoshell (@delan, #31545).

Three of the slowest crates in the Servo build process are mozjs_sys, mozangle, and script. The first two compile some very large C++ libraries in their build scripts - SpiderMonkey and ANGLE respectively - and the third blocks on the first two. They can account for over two minutes of build time, even on a very fast machine (AMD 7950X), and a breaking change in newer versions of GNU Make (mozjs#375) can make mozjs_sys take over eight minutes to build!

mozjs_sys now uses a prebuilt version of SpiderMonkey by default (@wusyong, @sagudev, mozjs#450, #31824), cutting clean build times by over seven minutes on a very fast machine (see above). On Linux with Nix (the package manager), where we run an unaffected version of GNU Make, it can still save over 100 seconds on a quad-core CPU with SMT. Further savings will be possible once we do the same for mozangle.

If you use NixOS, or any Linux distro with Nix, you can now get a shell with all of the tools and dependencies needed to build and run Servo by typing nix-shell (@delan, #32035), without also needing to type etc/shell.nix.

As for CI, our experimental Android build now supports aarch64 (@mukilan, #32137), in addition to Android on armv7, x86_64, and i686, and we've reduced flakiness in the WebGPU tests (@sagudev, #31952) and macOS builds (@mrobinson, #32005).

Conferences and events

Earlier this month, Rakhi Sharma gave her talk A year of Servo reboot: where are we now? at Open Source Summit North America (slides; recording available soon) and at the Seattle Rust User Group (slides).

In the Netherlands, Gregory Terzian will be presenting Modular Servo: Three Paths Forward at the GOSIM Conference 2024, on 6 May at 15:10 local time (13:10 UTC). That's the same venue as RustNL 2024, just one day earlier, and you can also find Gregory, Rakhi, and Nico at RustNL afterwards. See you there!

26 Apr 2024 12:00am GMT

25 Apr 2024


Will Kahn-Greene: crashstats-tools v2.0.0 released!

What is it?

crashstats-tools is a set of command-line tools for working with Crash Stats (https://crash-stats.mozilla.org/).

crashstats-tools comes with four commands:

  • supersearch: for performing Crash Stats Super Search queries

  • supersearchfacet: for performing aggregations, histograms, and cardinality Crash Stats Super Search queries

  • fetch-data: for fetching raw crash, dumps, and processed crash data for specified crash ids

  • reprocess: for sending crash report reprocess requests

v2.0.0 released!

There have been a lot of improvements since the last blog post for the v1.0.1 release: new commands, new features, an improved CLI UI, and more.

v2.0.0 focused on two major things:

  1. improving supersearchfacet to support nested aggregation, histogram, and cardinality queries

  2. moving some of the code into a crashstats_tools.libcrashstats module, improving its use as a library

Improved supersearchfacet

The other day, Alex and team finished up the crash reporter Rust rewrite. The rewrite landed and is available in Firefox on the nightly channel, in builds where build_id >= 20240321093532.

The crash reporter is one of the clients that submits crash reports to Socorro, which is now maintained by the Observability Team. Firefox has multiple crash reporter clients and there are many ways that crash reports can get submitted to Socorro.

One of the changes we can now see in the crash report data is the change in the User-Agent header. The new rewritten crash reporter sends a header of crashreporter/1.0.0. That gets captured by the collector and put in the raw crash metadata.user_agent field. It doesn't get indexed, so we can't search on it directly.

We can get a sampling of the last 100 crash reports, download the raw crash data, and look at the user agents.

$ supersearch --num=100 --product=Firefox --build_id='>=20240321093532' \
    --release_channel=nightly > crashids.txt
$ fetch-data --raw --no-dumps --no-processed crashdata < crashids.txt
$ jq .metadata.user_agent crashdata/raw_crash/*/* | sort | uniq -c
     16 "crashreporter/1.0.0"
      2 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0"
      1 "Mozilla/5.0 (Windows NT 10.0; rv:127.0) Gecko/20100101 Firefox/127.0"
      2 "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:126.0) Gecko/20100101 Firefox/126.0"
     63 "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:127.0) Gecko/20100101 Firefox/127.0"
      1 "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0"
     12 "Mozilla/5.0 (X11; Linux x86_64; rv:127.0) Gecko/20100101 Firefox/127.0"
      3 "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:127.0) Gecko/20100101 Firefox/127.0"

16 out of 100 crash reports were submitted by the new crash reporter. We were surprised there were so many Firefox user agents. We discussed this on Slack. I loosely repeat it here because it's a great way to show off some of the changes to supersearchfacet in v2.0.0.

First, the rewritten crash reporter only affects the parent (aka main) process. The other processes have different crash reporters that weren't rewritten.

How many process types are there for Firefox crash reports in the last week? We can see that in the ProcessType annotation (docs), which is processed and saved in the process_type field (docs).

$ supersearchfacet --product=Firefox --build_id='>=20240321093532' --release_channel=nightly \
    --_facets=process_type
process_type
 process_type | count
--------------|-------
 content      | 3664
 parent       | 2323
 gpu          | 855
 utility      | 225
 rdd          | 60
 plugin       | 18
 socket       | 2
 total        | 7147
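
(As an aside, the same process_type aggregation can be fetched straight from the public Super Search API if you want the numbers in a script rather than on the command line. Here is a rough sketch using requests; it assumes only public fields are involved, so no API token is needed.)

import requests

resp = requests.get(
    "https://crash-stats.mozilla.org/api/SuperSearch/",
    params={
        "product": "Firefox",
        "build_id": ">=20240321093532",
        "release_channel": "nightly",
        "_facets": "process_type",
        "_results_number": 0,  # we only want the facets, not individual crash hits
    },
)
resp.raise_for_status()
for facet in resp.json()["facets"]["process_type"]:
    print(f"{facet['term']:<12} {facet['count']}")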

Judging by that output, I would expect to see a higher percentage of crashreporter/1.0.0 in our sampling of 100 crash reports.

Turns out that Firefox uses different code to submit crash reports not just by process type, but also by user action. That's in the SubmittedFrom annotation (docs), which is processed and saved in the submitted_from field (docs).

$ supersearchfacet --product=Firefox --build_id='>=20240321093532' --release_channel=nightly \
    --_facets=submitted_from
submitted_from
 submitted_from | count
----------------|-------
 Auto           | 3477
 Client         | 1741
 CrashedTab     | 928
 Infobar        | 792
 AboutCrashes   | 209
 total          | 7147

What is "Auto"? The user can opt-in to auto-send crash reports. When Firefox upgrades and this setting is set, then Firefox will auto-send any unsubmitted crash reports. The nightly channel has two updates a day, so there's lots of opportunity for this event to trigger.

What're the counts for submitted_from/process_type pairs?

$ supersearchfacet --product=Firefox --build_id='>=20240321093532' --release_channel=nightly \
    --_aggs.process_type=submitted_from
process_type / submitted_from
 process_type / submitted_from | count
-------------------------------|-------
 content / Auto                | 2214
 content / CrashedTab          | 926
 content / Infobar             | 399
 content / AboutCrashes        | 125
 parent / Client               | 1741
 parent / Auto                 | 450
 parent / Infobar              | 107
 parent / AboutCrashes         | 25
 gpu / Auto                    | 565
 gpu / Infobar                 | 236
 gpu / AboutCrashes            | 54
 utility / Auto                | 198
 utility / Infobar             | 25
 utility / AboutCrashes        | 2
 rdd / Auto                    | 34
 rdd / Infobar                 | 23
 rdd / AboutCrashes            | 3
 plugin / Auto                 | 14
 plugin / CrashedTab           | 2
 plugin / Infobar              | 2
 socket / Auto                 | 2
 total                         | 7147

We can spot check these different combinations to see what the user-agent looks like.

For brevity, we'll just look at parent / Client in this blog post.

$ supersearch --num=100 --product=Firefox --build_id='>=20240321093532' --release_channel=nightly \
    --process_type=parent --submitted_from='~Client' > crashids_clarified.txt
$ fetch-data --raw --no-dumps --no-processed crashdata_clarified < crashids_clarified.txt
$ jq .metadata.user_agent crashdata_clarified/raw_crash/*/* | sort | uniq -c
    100 "crashreporter/1.0.0"

Seems like the crash reporter rewrite only affects crash reports where ProcessType=parent and SubmittedFrom=Client. All the other process_type/submitted_from combinations get submitted a different way where the user agent is the browser itself.

How many crash reports has the new crash reporter submitted over time?

$ supersearchfacet --_histogram.date=product --_histogram.interval=1d --denote-weekends \
    --date='>=2024-03-20' --date='<=2024-04-25' \
    --release_channel=nightly --product=Firefox --build_id='>=20240321093532' \
    --submitted_from='~Client' --process_type=parent
histogram_date.product
 histogram_date | Firefox | total
----------------|---------|-------
 2024-03-21     | 58      | 58
 2024-03-22     | 124     | 124
 2024-03-23 **  | 189     | 189
 2024-03-24 **  | 289     | 289
 2024-03-25     | 202     | 202
 2024-03-26     | 164     | 164
 2024-03-27     | 199     | 199
 2024-03-28     | 187     | 187
 2024-03-29     | 188     | 188
 2024-03-30 **  | 155     | 155
 2024-03-31 **  | 146     | 146
 2024-04-01     | 201     | 201
 2024-04-02     | 226     | 226
 2024-04-03     | 236     | 236
 2024-04-04     | 266     | 266
 2024-04-05     | 259     | 259
 2024-04-06 **  | 227     | 227
 2024-04-07 **  | 214     | 214
 2024-04-08     | 259     | 259
 2024-04-09     | 257     | 257
 2024-04-10     | 223     | 223
 2024-04-11     | 250     | 250
 2024-04-12     | 235     | 235
 2024-04-13 **  | 154     | 154
 2024-04-14 **  | 162     | 162
 2024-04-15     | 207     | 207
 2024-04-16     | 201     | 201
 2024-04-17     | 346     | 346
 2024-04-18     | 270     | 270
 2024-04-19     | 221     | 221
 2024-04-20 **  | 190     | 190
 2024-04-21 **  | 183     | 183
 2024-04-22     | 266     | 266
 2024-04-23     | 303     | 303
 2024-04-24     | 308     | 308

There are more examples in the crashstats-tools README.

crashstats_tools.libcrashstats library

Starting with v2.0.0, you can use crashstats_tools.libcrashstats as a library for Python scripts.

For example:

from crashstats_tools.libcrashstats import supersearch

results = supersearch(params={"_columns": "uuid"}, num_results=100)

for result in results:
    print(f"{result}")

libcrashstats makes using the Crash Stats API a little more ergonomic.
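
For instance, the crash id sampling from earlier in this post could look roughly like this with libcrashstats. This is just a sketch: the params mirror the Super Search fields used on the command line above, and the exact shape of each result is worth checking against the library documentation.

from crashstats_tools.libcrashstats import supersearch

# Roughly the same query as the first supersearch command in this post:
# crash ids for Firefox nightly builds with the rewritten crash reporter.
params = {
    "product": "Firefox",
    "build_id": ">=20240321093532",
    "release_channel": "nightly",
    "_columns": "uuid",
}

for result in supersearch(params=params, num_results=100):
    print(result["uuid"])  # each result holds the requested _columns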

See the crashstats_tools.libcrashstats library documentation.

Be thoughtful about using data

Make sure to use these tools in compliance with our data policy:

https://crash-stats.mozilla.org/documentation/protected_data_access/

Where to go for more

See the project on GitHub, which includes a README with usage examples and everything else about the project, plus the issue tracker and the source code:

https://github.com/willkg/crashstats-tools

Let me know whether this helps you!

25 Apr 2024 4:00pm GMT

Hacks.Mozilla.Org: Llamafile’s progress, four months in

When Mozilla's Innovation group first launched the llamafile project late last year, we were thrilled by the immediate positive response from open source AI developers. It's become one of Mozilla's top three most-favorited repositories on GitHub, attracting a number of contributors, some excellent PRs, and a growing community on our Discord server.

Through it all, lead developer and project visionary Justine Tunney has remained hard at work on a wide variety of fundamental improvements to the project. Just last night, Justine shipped the v0.8 release of llamafile, which includes not only support for the very latest open models, but also a number of big performance improvements for CPU inference.

As a result of Justine's work, today llamafile is both the easiest and fastest way to run a wide range of open large language models on your own hardware. See for yourself: with llamafile, you can run Meta's just-released LLaMA 3 model, which rivals the very best models available in its size class, on an everyday MacBook.

How did we do it? To explain that, let's take a step back and tell you about everything that's changed since v0.1.

tinyBLAS: democratizing GPU support for NVIDIA and AMD

llamafile is built atop the now-legendary llama.cpp project. llama.cpp supports GPU-accelerated inference for NVIDIA processors via the cuBLAS linear algebra library, but that requires users to install NVIDIA's CUDA SDK. We felt uncomfortable with that fact, because it conflicts with our project goal of building a fully open-source and transparent AI stack that anyone can run on commodity hardware. And besides, getting CUDA set up correctly can be a bear on some systems. There had to be a better way.

With the community's help (here's looking at you, @ahgamut and @mrdomino!), we created our own solution: it's called tinyBLAS, and it's llamafile's brand-new and highly efficient linear algebra library. tinyBLAS makes NVIDIA acceleration simple and seamless for llamafile users. On Windows, you don't even need to install CUDA at all; all you need is the display driver you've probably already installed.

But tinyBLAS is about more than just NVIDIA: it supports AMD GPUs, as well. This is no small feat. While AMD commands a respectable 20% of today's GPU market, poor software and driver support have historically made its GPUs a secondary player in the machine learning space. That's a shame, given that AMD's GPUs offer high performance, are price competitive, and are widely available.

One of llamafile's goals is to democratize access to open source AI technology, and that means getting AMD a seat at the table. That's exactly what we've done: with llamafile's tinyBLAS, you can now easily make full use of your AMD GPU to accelerate local inference. And, as with CUDA, if you're a Windows user you don't even have to install AMD's ROCm SDK.

All of this means that, for many users, llamafile will automatically use your GPU right out of the box, with little to no effort on your part.

CPU performance gains for faster local AI

Here at Mozilla, we are keenly interested in the promise of "local AI," in which AI models and applications run directly on end-user hardware instead of in the cloud. Local AI is exciting because it opens up the possibility of more user control over these systems and greater privacy and security for users.

But many consumer devices lack the high-end GPUs that are often required for inference tasks. llama.cpp has been a game-changer in this regard because it makes local inference both possible and usably performant on CPUs instead of just GPUs.

Justine's recent work on llamafile has now pushed the state of the art even further. As documented in her detailed blog post on the subject, by writing 84 new matrix multiplication kernels she was able to increase llamafile's prompt evaluation performance by an astonishing 10x compared to our previous release. This is a substantial and impactful step forward in the quest to make local AI viable on consumer hardware.

This work is also a great example of our commitment to the open source AI community. After completing this work we immediately submitted a PR to upstream these performance improvements to llama.cpp. This was just the latest of a number of enhancements we've contributed back to llama.cpp, a practice we plan to continue.

Raspberry Pi performance gains

Speaking of consumer hardware, there are few examples that are both more interesting and more humble than the beloved Raspberry Pi. For a bargain basement price, you get a full-featured computer running Linux with plenty of computing power for typical desktop uses. It's an impressive package, but historically it hasn't been considered a viable platform for AI applications.

Not any more. llamafile has now been optimized for the latest model (the Raspberry Pi 5), and the result is that a number of small LLMs, such as Rocket-3B (download), TinyLLaMA-1.5B (download), and Phi-2 (download), run at usable speeds on one of the least expensive computers available today. We've seen prompt evaluation speeds of up to 80 tokens/sec in some cases!

Keeping up with the latest models

The pace of progress in the open model space has been stunningly fast. Over the past few months, hundreds of models have been released or updated via fine-tuning. Along the way, there has been a clear trend of ever-increasing model performance and ever-smaller model sizes.

The llama.cpp project has been doing an excellent job of keeping up with all of these new models, frequently rolling out support for new architectures and model features within days of their release.

For our part we've been keeping llamafile closely synced with llama.cpp so that we can support all the same models. Given the complexity of both projects, this has been no small feat, so we're lucky to have Justine on the case.

Thanks to her hard work, today you can use the very latest and most capable open models with llamafile. For example, we were able to roll out llamafiles for Meta's newest LLaMA 3 models, 8B-Instruct and 70B-Instruct, within a day of their release. With yesterday's 0.8 release, llamafile can also run Grok, Mixtral 8x22B, and Command-R.

Creating your own llamafiles

Since the day llamafile shipped, people have wanted to create their own llamafiles. Previously, this required a number of steps, but today you can do it with a single command, e.g.:

llamafile-convert [model.gguf]

In just moments, this will produce a "model.llamafile" file that is ready for immediate use. Our thanks to community member @chan1012 for contributing this helpful improvement.

In a related development, Hugging Face recently added official support for llamafile within their model hub. This means you can now search and filter Hugging Face specifically for llamafiles created and distributed by other people in the open source community.

OpenAI-compatible API server

Since it's built on top of llama.cpp, llamafile inherits that project's server component, which provides OpenAI-compatible API endpoints. This enables developers who are building on top of OpenAI to switch to using open models instead. At Mozilla we very much want to support this kind of future: one where open-source AI is a viable alternative to centralized, closed, commercial offerings.

While open models do not yet fully rival the capabilities of closed models, they're making rapid progress. We believe that making it easier to pivot existing code over to executing against open models will increase demand and further fuel this progress.

Over the past few months, we've invested effort in extending these endpoints, both to increase functionality and improve compatibility. Today, llamafile can serve as a drop-in replacement for OpenAI in a wide variety of use cases.
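
For example, here is a minimal sketch of pointing the official openai Python client at a local llamafile instead of the hosted API. It assumes a llamafile is already running its built-in server on the default local port 8080; adjust base_url and the model name to match your setup.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # local llamafile server instead of api.openai.com
    api_key="sk-no-key-required",         # the local server does not check the key
)

completion = client.chat.completions.create(
    model="LLaMA_CPP",  # the local server serves whatever model the llamafile bundles
    messages=[{"role": "user", "content": "Say hello from a local model."}],
)
print(completion.choices[0].message.content)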

We want to further extend our API server's capabilities, and we're eager to hear what developers want and need. What's holding you back from using open models? What features, capabilities, or tools do you need? Let us know!

Integrations with other open source AI projects

Finally, it's been a delight to see llamafile adopted by independent developers and integrated into leading open source AI projects (like Open Interpreter). Kudos in particular to our own Kate Silverstein who landed PRs that add llamafile support to LangChain and LlamaIndex (with AutoGPT coming soon).

If you're a maintainer or contributor to an open source AI project that you feel would benefit from llamafile integration, let us know how we can help.

Join us!

The llamafile project is just getting started, and it's also only the first step in a major new initiative on Mozilla's part to contribute to and participate in the open source AI community. We'll have more to share about that soon, but for now: I invite you to join us on the llamafile project!

The best place to connect with both the llamafile team at Mozilla and the overall llamafile community is over at our Discord server, which has a dedicated channel just for llamafile. And of course, your enhancement requests, issues, and PRs are always welcome over at our GitHub repo.

I hope you'll join us. The next few months are going to be even more interesting and unexpected than the last, both for llamafile and for open source AI itself.

25 Apr 2024 3:34pm GMT