25 May 2021

feedPlanet Maemo

GStreamer WebKit debugging by using external tools (2/2)

This is the last post of the series showing interesting debugging tools, I hope you have found it useful. Don't miss the custom scripts at the bottom to process GStreamer logs, help you highlight the interesting parts and find the root cause of difficult bugs. Here are also the previous posts of the series:

How to debug pkgconfig

When pkg-config finds the PKG_CONFIG_DEBUG_SPEW env var, it explains all the steps used to resolve the packages:

PKG_CONFIG_DEBUG_SPEW=1 /usr/bin/pkg-config --libs x11

This is useful to know why a particular package isn't found and what are the default values for PKG_CONFIG_PATH when it's not defined. For example:

Adding directory '/usr/local/lib/x86_64-linux-gnu/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/local/lib/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/local/share/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/lib/x86_64-linux-gnu/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/lib/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/share/pkgconfig' from PKG_CONFIG_PATH

If we have tuned PKG_CONFIG_PATH, maybe we also want to add the default paths. For example:

SYSROOT=~/sysroot-x86-64
export PKG_CONFIG_PATH=${SYSROOT}/usr/local/lib/pkgconfig:${SYSROOT}/usr/lib/pkgconfig
# Add also the standard pkg-config paths to find libraries in the system
export PKG_CONFIG_PATH=${PKG_CONFIG_PATH}:/usr/local/lib/x86_64-linux-gnu/pkgconfig:\
/usr/local/lib/pkgconfig:/usr/local/share/pkgconfig:/usr/lib/x86_64-linux-gnu/pkgconfig:\
/usr/lib/pkgconfig:/usr/share/pkgconfig
# This tells pkg-config where the "system" pkg-config dir is. This is useful when cross-compiling for other
# architecture, to avoid pkg-config using the system .pc files and mixing host and target libraries
export PKG_CONFIG_LIBDIR=${SYSROOT}/usr/lib
# This could have been used for cross compiling:
#export PKG_CONFIG_SYSROOT_DIR=${SYSROOT}

Man in the middle proxy for WebKit

Sometimes it's useful to use our own modified/unminified files with a 3rd party service we don't control. Mitmproxy can be used as a man-in-the-middle proxy, but I haven't tried it personally yet. What I have tried (with WPE) is this:

  1. Add an /etc/hosts entry to point the host serving the files we want to change to an IP address controlled by us.
  2. Configure a web server to provide the files in the expected path.
  3. Modify the ResourceRequestBase constructor to change the HTTPS requests to HTTP when the hostname matches the target:
ResourceRequestBase(const URL& url, ResourceRequestCachePolicy policy)
    : m_url(url)
    , m_timeoutInterval(s_defaultTimeoutInterval)
    ...
    , m_isAppBound(false)
{
    if (m_url.host().toStringWithoutCopying().containsIgnoringASCIICase(String("out-of-control-service.com"))
        && m_url.protocol().containsIgnoringASCIICase(String("https"))) {
        printf("### %s: URL %s detected, changing from https to http\n",
            __PRETTY_FUNCTION__, m_url.string().utf8().data());
        fflush(stdout);
        m_url.setProtocol(String("http"));
    }
}

:bulb: Pro tip: If you have to debug minified/obfuscated JavaScript code and don't have a deobfuscated version to use in a man-in-the-middle fashion, use http://www.jsnice.org/ to deobfuscate it and get meaningful variable names.

Bandwidth control for a dependent device

If your computer has a "shared internet connection" enabled in Network Manager and provides access to a dependent device , you can control the bandwidth offered to that device. This is useful to trigger quality changes on adaptive streaming videos from services out of your control.

This can be done using tc, the Traffic Control tool from the Linux kernel. You can use this script to automate the process (edit it to suit to your needs).

Useful scripts to process GStreamer logs

I use these scripts in my daily job to look for strange patterns in GStreamer logs that help me to find the cause of the bugs I'm debugging:

0 Add to favourites0 Bury

25 May 2021 6:00am GMT

18 May 2021

feedPlanet Maemo

GStreamer WebKit debugging by using external tools (1/2)

In this new post series, I'll show you how both existing and ad-hoc tools can be helpful to find the root cause of some problems. Here are also the older posts of this series in case you find them useful:

Use strace to know which config/library files are used by a program

If you're becoming crazy supposing that the program should use some config and it seems to ignore it, just use strace to check what config files, libraries or other kind of files is the program actually using. Use the grep rules you need to refine the search:

$ strace -f -e trace=%file nano 2> >(grep 'nanorc')
access("/etc/nanorc", R_OK)             = 0
access("/usr/share/nano/javascript.nanorc", R_OK) = 0
access("/usr/share/nano/gentoo.nanorc", R_OK) = 0
...

Know which process is killing another one

First, try to strace -e trace=signal -p 1234 the killed process.

If that doesn't work (eg: because it's being killed with the uncatchable SIGKILL signal), then you can resort to modifying the kernel source code (signal.c) to log the calls to kill():

SYSCALL_DEFINE2(kill, pid_t, pid, int, sig)
{
    struct task_struct *tsk_p;
    ...
    /* Log SIGKILL */
    if (sig & 0x1F == 9) {
        tsk_p = find_task_by_vpid(pid);

        if (tsk_p) {
            printk(KERN_DEBUG "Sig: %d from pid: %d (%s) to pid: %d (%s)\n",
                sig, current->pid, current->comm, pid, tsk_p->comm);
        } else {
            printk(KERN_DEBUG "Sig: %d from pid: %d (%s) to pid: %d\n",
                sig, current->pid, current->comm, pid);
        }
    }
    ...
}

Wrap gcc/ld/make to tweak build parameters

If you ever find yourself with little time in front of a stubborn build system and, no matter what you try, you can't get the right flags to the compiler, think about putting something (a wrapper) between the build system and the compiler. Example for g++:

#!/bin/bash
main() {
    # Build up arg[] array with all options to be passed
    # to subcommand.
    i=0
    for opt in "$@"; do
        case "$opt" in
        -O2) ;; # Removes this option
        *)
            arg[i]="$opt" # Keeps the others
            i=$((i+1))
            ;;
        esac
    done
    EXTRA_FLAGS="-O0" # Adds extra option
    echo "g++ ${EXTRA_FLAGS} ${arg[@]}" # >> /tmp/build.log # Logs the command
    /usr/bin/ccache g++ ${EXTRA_FLAGS} "${arg[@]}" # Runs the command
}
main "$@"

Make sure that the wrappers appear earlier than the real commands in your PATH.

The make wrapper can also call remake instead. Remake is fully compatible with make but has features to help debugging compilation and makefile errors.

Analyze the structure of MP4 data

The ISOBMFF Box Structure Viewer online tool allows you to upload an MP4 file and explore its structure.

0 Add to favourites0 Bury

18 May 2021 6:00am GMT

13 May 2021

feedPlanet Maemo

Mould King 13106 Forklift Report

YouTube Video

I got myself the MouldKing Forklift MOC set and wanted to share my findings with you.

The set comes with "New PowerModule 4.0″, which means it supports continuous output. If you get the new joystick controller or use the app, you can have smooth controls of the motors and not just binary 0% or 100% throttle as with the standard remote.

As you can see, I actually put on some of the stickers. Some purists never do anything like this, because they argue that after some time the stickers start peeling off and look used. This is certainly a good point if you are building a sports-car - with a Forklift however, I would argue used stickers add to the looks.

Manual errata & comments

Generally, I prefer the Mould King manual to the original by Kevin Moo as I like renderings more than photographs. However, its nice to have the original at hand if something looks fishy. While building, I noticed the following:

Step 34: Cable-management is almost completely skipped in the manual. I laid all cables through the opening behind the threads. This keeps them out of the way later. The fiddle through the cables of the motors, that you add at steps 52 & 55.

Step 100: The battery-box position is wrong. It will collide with the bar you added at step 96. To make it fit, just rotate the battery-box by 180°.

Also, the direction of motor A has to be reversed. Press and hold left-shoulder, up and down for 3 seconds for this.

Step 111: The arms that you added in steps 89/ 90 should be oriented upwards to hold the footstep.

Step 143: Use a black bush instead of the 2-pin-axle beam, so things look symmetrical. This is a leftover from the original MOC, which squashed the IR receiver in there.

Step 214: I suggest using gray 2-axles at step 230 instead of the suggested whites. This way the front facing axes will be all gray. For this just use white 2-axles here. Those wont be visible at all anyway.

Step 277: When adding the fork to the lift-arm, make sure that it has as much play as possible. Otherwise the fork will get stuck when moved all the way up.

Step 278: Do not fix the threads yet. Wait until the end so you can correctly measure the lowest position of the fork (which gives you the length of the threads).

Step 286: Make sure that the 3-pin pops out towards the 8-axle. This will make joining things at step 288 much easier.

0 Add to favourites0 Bury

13 May 2021 5:11pm GMT

11 May 2021

feedPlanet Maemo

GStreamer WebKit debugging by instrumenting source code (3/3)

This is the last post on the instrumenting source code series. I hope you to find the tricks below as useful as the previous ones.

In this post I show some more useful debugging tricks. Don't forget to have a look at the other posts of the series:

Finding memory leaks in a RefCounted subclass

The source code shown below must be placed in the .h where the class to be debugged is defined. It's written in a way that doesn't need to rebuild RefCounted.h, so it saves a lot of build time. It logs all refs, unrefs and adoptPtrs, so that any anomaly in the refcounting can be traced and investigated later. To use it, just make your class inherit from LoggedRefCounted instead of RefCounted.

Example output:

void WTF::adopted(WTF::LoggedRefCounted<T>*) [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1
void WTF::adopted(WTF::LoggedRefCounted<T>*) [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1
^^^ Two adopts, this is not good.
void WTF::LoggedRefCounted<T>::ref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1 --> ...
void WTF::LoggedRefCounted<T>::ref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount ... --> 2
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 2 --> ...
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount ... --> 1
void WTF::adopted(WTF::LoggedRefCounted<T>*) [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1 --> ...
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1 --> ...
^^^ Two recursive derefs, not good either.
#include "Logging.h"

namespace WTF {

template<typename T> class LoggedRefCounted : public WTF::RefCounted<T> {
    WTF_MAKE_NONCOPYABLE(LoggedRefCounted); WTF_MAKE_FAST_ALLOCATED;
public:
    void ref() {
        printf("%s: this=%p, refCount %d --> ...\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
        WTF::RefCounted<T>::ref();
        printf("%s: this=%p, refCount ... --> %d\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
    }

    void deref() {
        printf("%s: this=%p, refCount %d --> ...\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
        WTF::RefCounted<T>::deref();
        printf("%s: this=%p, refCount ... --> %d\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
    }

protected:
    LoggedRefCounted() { }
    ~LoggedRefCounted() { }
};

template<typename T> inline void adopted(WTF::LoggedRefCounted<T>* object)
{
    printf("%s: this=%p, refCount %d\n", __PRETTY_FUNCTION__, object, (object)?object->refCount():0); fflush(stdout);
    adopted(static_cast<RefCountedBase*>(object));
}

} // Namespace WTF

Pause WebProcess on launch

WebProcessMainGtk and WebProcessMainWPE will sleep for 30 seconds if a special environment variable is defined:

export WEBKIT2_PAUSE_WEB_PROCESS_ON_LAUNCH=1

It only works #if ENABLE(DEVELOPER_MODE), so you might want to remove those ifdefs if you're building in Release mode.

Log tracers

In big pipelines (e.g. playbin) it can be very hard to find what element is replying to a query or handling an event. Even using gdb can be extremely tedious due to the very high level of recursion. My coworker Alicia commented that using log tracers is more helpful in this case.

GST_TRACERS=log enables additional GST_TRACE() calls all accross GStreamer. The following example logs entries and exits into the query function.

GST_TRACERS=log GST_DEBUG='query:TRACE'

The names of the logging categories are somewhat inconsistent:

The log tracer code is in subprojects/gstreamer/plugins/tracers/gstlog.c.

0 Add to favourites0 Bury

11 May 2021 6:00am GMT

07 May 2021

feedPlanet Maemo

Comparing water filters

Lets say, you want to reduce the water carbonate hardness because you got a shiny coffee machine and descaling that is a time-consuming mess.

If you dont happen to run a coffee-shop, using a water-jug is totally sufficient for this. Unfortunately, while the jug itself is quite cheap, the filters you need will cost you an arm and a leg - similar to how the printer-ink business works.

The setup

Here, we want to look at the different filter options and compare their performance. The contenders are

Name Pricing
Brita Classic ~15.19 €
PearlCo Classic 12.90 €
PearlCo Protect+ 15.90 €

As said initially, the primary goal of using these filters is to reduce the water carbonate - any other changes, like pH mythology, will not be considered.

To measure the performance in this regard, I am using a digital total dissolved solid meter - just like the one used in the Wikipedia article. To make the measurement robust against environmental variations, I am not only measuring the PPM in the filtered water, but also in the tap water before filtering. The main indicator is then the reduction factor.

Also, you are not using the filter only once, so I repeat the measuring over the course of 37 days. Why 37? Well, most filters are specified for 30 days of usage - but I want to see how much cushion we got there.

So - without further ado - the results:

Results

Name Ø PPM reduction Ø absolute PPM
Brita Classic 31% 206
PearlCo Classic 24% 218
PearlCo Protect+ 32% 191

As motivated above, the difference in absolute PPM can be explained by environmental variation - after all the measurements took place over the course of more than 3 months.

However, we see that the pricing difference is indeed reflected by filtering performance. By paying ~20% more, you get a ~30% higher PPM reduction.

The only thing missing, is the time-series to see beyond 30 days:

As you can see, the filtering performance is continuously declining after a peak at about 10-15 days of use.

And for completeness, the absolute PPM values:

0 Add to favourites0 Bury

07 May 2021 6:08pm GMT

04 May 2021

feedPlanet Maemo

GStreamer WebKit debugging by instrumenting source code (2/3)

In this post I show some more useful debugging tricks. Check also the other posts of the series:

Print current thread id

The thread id is generated by Linux and can take values higher than 1-9, just like PIDs. This thread number is useful to know which function calls are issued by the same thread, avoiding confusion between threads.

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

printf("%s [%d]\n", __PRETTY_FUNCTION__, syscall(SYS_gettid));
fflush(stdout);

Debug GStreamer thread locks

We redefine the GST_OBJECT_LOCK/UNLOCK/TRYLOCK macros to print the calls, compare locks against unlocks, and see who's not releasing its lock:

#include "wtf/Threading.h"
#define GST_OBJECT_LOCK(obj) do { \
  printf("### [LOCK] %s [%p]\n", __PRETTY_FUNCTION__, &Thread::current()); fflush(stdout); \
  g_mutex_lock(GST_OBJECT_GET_LOCK(obj)); \
} while (0)
#define GST_OBJECT_UNLOCK(obj) do { \
  printf("### [UNLOCK] %s [%p]\n", __PRETTY_FUNCTION__, &Thread::current()); fflush(stdout); \
  g_mutex_unlock(GST_OBJECT_GET_LOCK(obj)); \
} while (0)
#define GST_OBJECT_TRYLOCK(obj) ({ \
  gboolean result = g_mutex_trylock(GST_OBJECT_GET_LOCK(obj)); \
  if (result) { \
   printf("### [LOCK] %s [%p]\n", __PRETTY_FUNCTION__, &Thread::current()); fflush(stdout); \
  } \
  result; \
})

Warning: The statement expression that allows the TRYLOCK macro to return a value will only work on GCC.

There's a way to know which thread has taken a lock in glib/GStreamer using gdb. First locate the stalled thread:

(gdb) thread
(gdb) bt
#2  0x74f07416 in pthread_mutex_lock ()
#3  0x7488aec6 in gst_pad_query ()
#4  0x6debebf2 in autoplug_query_allocation ()

(gdb) frame 3
#3  0x7488aec6 in gst_pad_query (pad=pad@entry=0x54a9b8, ...)
4058        GST_PAD_STREAM_LOCK (pad);

Now get the process id (PID) and use the pthread_mutex_t structure to print the Linux thread id that has acquired the lock:

(gdb) call getpid()
$30 = 6321
(gdb) p ((pthread_mutex_t*)pad.stream_rec_lock.p)->__data.__owner
$31 = 6368
(gdb) thread find 6321.6368
Thread 21 has target id 'Thread 6321.6368'

Trace function calls (poor developer version)

If you're using C++, you can define a tracer class. This is for webkit, but you get the idea:

#define MYTRACER() MyTracer(__PRETTY_FUNCTION__);
class MyTracer {
public:
    MyTracer(const gchar* functionName)
      : m_functionName(functionName) {
      printf("### %s : begin %d\n", m_functionName.utf8().data(), currentThread()); fflush(stdout);
    }
    virtual ~MyTracer() {
        printf("### %s : end %d\n", m_functionName.utf8().data(), currentThread()); fflush(stdout);
    }
private:
    String m_functionName;
};

And use it like this in all the functions you want to trace:

void somefunction() {
  MYTRACER();
  // Some other code...
}

The constructor will log when the execution flow enters into the function and the destructor will log when the flow exits.

Setting breakpoints from C

In the C code, just call raise(SIGINT) (simulate CTRL+C, normally the program would finish).

And then, in a previously attached gdb, after breaking and having debugging all you needed, just continue the execution by ignoring the signal or just plainly continuing:

(gdb) signal 0
(gdb) continue

There's a way to do the same but attaching gdb after the raise. Use raise(SIGSTOP) instead (simulate CTRL+Z). Then attach gdb, locate the thread calling raise and switch to it:

(gdb) thread apply all bt
[now search for "raise" in the terminal log]
Thread 36 (Thread 1977.2033): #1 0x74f5b3f2 in raise () from /home/enrique/buildroot/output2/staging/lib/libpthread.so.0
(gdb) thread 36

Now, from a terminal, send a continuation signal: kill -SIGCONT 1977. Finally instruct gdb to single-step only the current thread (IMPORTANT!) and run some steps until all the raises have been processed:

(gdb) set scheduler-locking on
(gdb) next    // Repeat several times...

Know the name of a GStreamer function stored in a pointer at runtime

Just use this macro:

GST_DEBUG_FUNCPTR_NAME(func)

Detecting memory leaks in WebKit

RefCountedLeakCounter is a tool class that can help to debug reference leaks by printing this kind of messages when WebKit exits:

  LEAK: 2 XMLHttpRequest
  LEAK: 25 CachedResource
  LEAK: 3820 WebCoreNode

To use it you have to modify the particular class you want to debug:

0 Add to favourites0 Bury

04 May 2021 6:00am GMT

27 Apr 2021

feedPlanet Maemo

GStreamer WebKit debugging by instrumenting source code (1/3)

This is the continuation of the GStreamer WebKit debugging tricks post series. In the next three posts, I'll focus on what we can get by doing some little changes to the source code for debugging purposes (known as "instrumenting"), but before, you might want to check the previous posts of the series:

Know all the env vars read by a program by using LD_PRELOAD to intercept libc calls

// File getenv.c
// To compile: gcc -shared -Wall -fPIC -o getenv.so getenv.c -ldl
// To use: export LD_PRELOAD="./getenv.so", then run any program you want
// See http://www.catonmat.net/blog/simple-ld-preload-tutorial-part-2/

#define _GNU_SOURCE

#include <stdio.h>
#include <dlfcn.h>

// This function will take the place of the original getenv() in libc
char *getenv(const char *name) {
 printf("Calling getenv(\"%s\")\n", name);

 char *(*original_getenv)(const char*);
 original_getenv = dlsym(RTLD_NEXT, "getenv");

 return (*original_getenv)(name);
}

See the breakpoints with command example to know how to get the same using gdb. Check also Zan's libpine for more features.

Track lifetime of GObjects by LD_PRELOADing gobject-list

The gobject-list project, written by Thibault Saunier, is a simple LD_PRELOAD library for tracking the lifetime of GObjects. When loaded into an application, it prints a list of living GObjects on exiting the application (unless the application crashes), and also prints reference count data when it changes. SIGUSR1 or SIGUSR2 can be sent to the application to trigger printing of more information.

Overriding the behaviour of a debugging macro

The usual debugging macros aren't printing messages? Redefine them to make what you want:

#undef LOG_MEDIA_MESSAGE
#define LOG_MEDIA_MESSAGE(...) do { \
  printf("LOG %s: ", __PRETTY_FUNCTION__); \
  printf(__VA_ARGS__); \
  printf("\n"); \
  fflush(stdout); \
} while(0)

This can be done to enable asserts on demand in WebKit too:

#undef ASSERT
#define ASSERT(assertion) \
  (!(assertion) ? \
      (WTFReportAssertionFailure(__FILE__, __LINE__, WTF_PRETTY_FUNCTION, #assertion), \
       CRASH()) : \
      (void)0)

#undef ASSERT_NOT_REACHED
#define ASSERT_NOT_REACHED() do { \
  WTFReportAssertionFailure(__FILE__, __LINE__, WTF_PRETTY_FUNCTION, 0); \
  CRASH(); \
} while (0)

It may be interesting to enable WebKit LOG() and GStreamer GST_DEBUG() macros only on selected files:

#define LOG(channel, msg, ...) do { \
  printf("%s: ", #channel); \
  printf(msg, ## __VA_ARGS__); \
  printf("\n"); \
  fflush(stdout); \
} while (false)

#define _GST_DEBUG(msg, ...) do { \
  printf("### %s: ", __PRETTY_FUNCTION__); \
  printf(msg, ## __VA_ARGS__); \
  printf("\n"); \
  fflush(stdout); \
} while (false)

Note all the preprocessor trickery used here:

Print the compile-time type of an expression

Use typeid(<expression>).name(). Filter the ouput through c++filt -t:

std::vector<char *> v;
printf("Type: %s\n", typeid(v.begin()).name());

Abusing the compiler to know all the places where a function is called

If you want to know all the places from where the GstClockTime toGstClockTime(float time) function is called, you can convert it to a template function and use static_assert on a wrong datatype like this (in the .h):

template <typename T = float> GstClockTime toGstClockTime(float time) {
  static_assert(std::is_integral<T>::value,
    "Don't call toGstClockTime(float)!");
  return 0;
}

Note that T=float is different to integer (is_integral). It has nothing to do with the float time parameter declaration.

You will get compile-time errors like this on every place the function is used:

WebKitMediaSourceGStreamer.cpp:474:87:   required from here
GStreamerUtilities.h:84:43: error: static assertion failed: Don't call toGstClockTime(float)!

Use pragma message to print values at compile time

Sometimes is useful to know if a particular define is enabled:

#include <limits.h>

#define _STR(x) #x
#define STR(x) _STR(x)

#pragma message "Int max is " STR(INT_MAX)

#ifdef WHATEVER
#pragma message "Compilation goes by here"
#else
#pragma message "Compilation goes by there"
#endif

...

The code above would generate this output:

test.c:6:9: note: #pragma message: Int max is 0x7fffffff
 #pragma message "Int max is " STR(INT_MAX)
         ^~~~~~~
test.c:11:9: note: #pragma message: Compilation goes by there
 #pragma message "Compilation goes by there"
         ^~~~~~~

0 Add to favourites0 Bury

27 Apr 2021 6:00am GMT

20 Apr 2021

feedPlanet Maemo

GStreamer WebKit debugging tricks using GDB (2/2)

This post is a continuation of a series of blog posts about the most interesting debugging tricks I've found while working on GStreamer WebKit on embedded devices. These are the other posts of the series published so far:

Print corrupt stacktraces

In some circumstances you may get stacktraces that eventually stop because of missing symbols or corruption (?? entries).

#3  0x01b8733c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

However, you can print the stack in a useful way that gives you leads about what was next in the stack:

You may want to enable asm-demangle: set print asm-demangle

Example output, the 3 last lines give interesting info:

0x7ef85550:     0x1b87400       0x2     0x0     0x1b87400
0x7ef85560:     0x0     0x1b87140       0x1b87140       0x759e88a4
0x7ef85570:     0x1b87330       0x759c71a9 <gst_base_sink_change_state+956>     0x140c  0x1b87330
0x7ef85580:     0x759e88a4      0x7ef855b4      0x0     0x7ef855b4
...
0x7ef85830:     0x76dbd6c4 <WebCore::AppendPipeline::resetPipeline()::__PRETTY_FUNCTION__>        0x4     0x3     0x1bfeb50
0x7ef85840:     0x0     0x76d59268      0x75135374      0x75135374
0x7ef85850:     0x76dbd6c4 <WebCore::AppendPipeline::resetPipeline()::__PRETTY_FUNCTION__>        0x1b7e300       0x1d651d0       0x75151b74

More info: 1

Sometimes the symbol names aren't printed in the stack memdump. You can do this trick to iterate the stack and print the symbols found there (take with a grain of salt!):

(gdb) set $i = 0
(gdb) p/a *((void**)($sp + 4*$i++))

[Press ENTER multiple times to repeat the command]

$46 = 0xb6f9fb17 <_dl_lookup_symbol_x+250>
$58 = 0xb40a9001 <g_log_writer_standard_streams+128>
$142 = 0xb40a877b <g_return_if_fail_warning+22>
$154 = 0xb65a93d5 <WebCore::MediaPlayerPrivateGStreamer::changePipelineState(GstState)+180>
$164 = 0xb65ab4e5 <WebCore::MediaPlayerPrivateGStreamer::playbackPosition() const+420>
...

Many times it's just a matter of gdb not having loaded the unstripped version of the library. /proc/<PID>/smaps and info proc mappings can help to locate the library providing the missing symbol. Then we can load it by hand.

For instance, for this backtrace:

#0  0x740ad3fc in syscall () from /home/enrique/buildroot-wpe/output/staging/lib/libc.so.6
#1  0x74375c44 in g_cond_wait () from /home/enrique/buildroot-wpe/output/staging/usr/lib/libglib-2.0.so.0
#2  0x6cfd0d60 in ?? ()

In a shell, we examine smaps and find out that the unknown piece of code comes from libgstomx:

$ cat /proc/715/smaps
...
6cfc1000-6cff8000 r-xp 00000000 b3:02 785380     /usr/lib/gstreamer-1.0/libgstomx.so
...

Now we load the unstripped .so in gdb and we're able to see the new symbol afterwards:

(gdb) add-symbol-file /home/enrique/buildroot-wpe/output/build/gst-omx-custom/omx/.libs/libgstomx.so 0x6cfc1000
(gdb) bt
#0  0x740ad3fc in syscall () from /home/enrique/buildroot-wpe/output/staging/lib/libc.so.6
#1  0x74375c44 in g_cond_wait () from /home/enrique/buildroot-wpe/output/staging/usr/lib/libglib-2.0.so.0
#2  0x6cfd0d60 in gst_omx_video_dec_loop (self=0x6e0c8130) at gstomxvideodec.c:1311
#3  0x6e0c8130 in ?? ()

Useful script to prepare the add-symbol-file:

cat /proc/715/smaps | grep '[.]so' | sed -e 's/-[0-9a-f]*//' | { while read ADDR _ _ _ _ LIB; do echo "add-symbol-file $LIB 0x$ADDR"; done; }

More info: 1

The "figuring out corrupt ARM stacktraces" post has some additional info about how to use addr2line to translate memory addresses to function names on systems with a hostile debugging environment.

Debugging a binary without debug symbols

There are times when there's just no way to get debug symbols working, or where we're simply debugging on a release version of the software. In those cases, we must directly debug the assembly code. The gdb text user interface (TUI) can be used to examine the disassebled code and the CPU registers. It can be enabled with these commands:

layout asm
layout regs
set print asm-demangle

Some useful keybindings in this mode:

This screenshot shows how we can infer that an empty RefPtr is causing a crash in some WebKit code.

Wake up an unresponsive gdb on ARM

Sometimes, when you continue ('c') execution on ARM there's no way to stop it again unless a breakpoint is hit. But there's a trick to retake the control: just send a harmless signal to the process.

kill -SIGCONT 1234

Know which GStreamer thread id matches with each gdb thread

Sometimes you need to match threads in the GStreamer logs with threads in a running gdb session. The simplest way is to ask it to GThread for each gdb thread:

(gdb) set output-radix 16
(gdb) thread apply all call g_thread_self()

This will print a list of gdb threads and GThread*. We only need to find the one we're looking for.

Generate a pipeline dump from gdb

If we have a pointer to the pipeline object, we can call the function that dumps the pipeline:

(gdb) call gst_debug_bin_to_dot_file_with_ts((GstBin*)0x15f0078, GST_DEBUG_GRAPH_SHOW_ALL, "debug")

0 Add to favourites0 Bury

20 Apr 2021 6:00am GMT

13 Apr 2021

feedPlanet Maemo

GStreamer WebKit debugging tricks using GDB (1/2)

I've been developing and debugging desktop and mobile applications on embedded devices over the last decade or so. The main part of this period I've been focused on the multimedia side of the WebKit ports using GStreamer, an area that is a mix of C (glib, GObject and GStreamer) and C++ (WebKit).

Over these years I've had to work on ARM embedded devices (mobile phones, set-top-boxes, Raspberry Pi using buildroot) where most of the environment aids and tools we take for granted on a regular x86 Linux desktop just aren't available. In these situations you have to be imaginative and find your own way to get the work done and debug the issues you find in along the way.

I've been writing down the most interesting tricks I've found in this journey and I'm sharing them with you in a series of 7 blog posts, one per week. Most of them aren't mine, and the ones I learnt in the begining of my career can even seem a bit naive, but I find them worth to share anyway. I hope you find them as useful as I do.

Breakpoints with command

You can break on a place, run some command and continue execution. Useful to get logs:

break getenv
command
 # This disables scroll continue messages
 # and supresses output
 silent
 set pagination off
 p (char*)$r0
continue
end

break grl-xml-factory.c:2720 if (data != 0)
command
 call grl_source_get_id(data->source)
 # $ is the last value in the history, the result of
 # the previous call
 call grl_media_set_source (send_item->media, $)
 call grl_media_serialize_extended (send_item->media,
  GRL_MEDIA_SERIALIZE_FULL)
 continue
end

This idea can be combined with watchpoints and applied to trace reference counting in GObjects and know from which places the refcount is increased and decreased.

Force execution of an if branch

Just wait until the if chooses a branch and then jump to the other one:

6 if (i > 3) {
(gdb) next
7 printf("%d > 3\n", i);
(gdb) break 9
(gdb) jump 9
9 printf("%d <= 3\n", i);
(gdb) next
5 <= 3

Debug glib warnings

If you get a warning message like this:

W/GLib-GObject(18414): g_object_unref: assertion `G_IS_OBJECT (object)' failed

the functions involved are: g_return_if_fail_warning(), which calls to g_log(). It's good to set a breakpoint in any of the two:

break g_log

Another method is to export G_DEBUG=fatal_criticals, which will convert all the criticals in crashes, which will stop the debugger.

Debug GObjects

If you want to inspect the contents of a GObjects that you have in a reference…

(gdb) print web_settings
$1 = (WebKitWebSettings *) 0x7fffffffd020

you can dereference it…

(gdb) print *web_settings
$2 = {parent_instance = {g_type_instance = {g_class = 0x18}, ref_count = 0, qdata = 0x0}, priv = 0x0}

even if it's an untyped gpointer…

(gdb) print user_data
(void *) 0x7fffffffd020
(gdb) print *((WebKitWebSettings *)(user_data))
{parent_instance = {g_type_instance = {g_class = 0x18}, ref_count = 0, qdata = 0x0}, priv = 0x0}

To find the type, you can use GType:

(gdb) call (char*)g_type_name( ((GTypeInstance*)0x70d1b038)->g_class->g_type )
$86 = 0x2d7e14 "GstOMXH264Dec-omxh264dec"

Instantiate C++ object from gdb

(gdb) call malloc(sizeof(std::string))
$1 = (void *) 0x91a6a0
(gdb) call ((std::string*)0x91a6a0)->basic_string()
(gdb) call ((std::string*)0x91a6a0)->assign("Hello, World")
$2 = (std::basic_string<char, std::char_traits<char>, std::allocator<char> > &) @0x91a6a0: {static npos = <optimized out>, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x91a6f8 "Hello, World"}}
(gdb) call SomeFunctionThatTakesAConstStringRef(*(const std::string*)0x91a6a0)

See: 1 and 2

0 Add to favourites0 Bury

13 Apr 2021 10:49am GMT

11 Feb 2021

feedPlanet Maemo

reBounce - softfp-to-hardfp LD_PRELOAD hack for Bounce on N9

This depends on Bounce (the N900 .deb) and SDL 1.2 being installed. Google "bounce_1.0.0_armel.deb" for the former, and use n9repomirror for the latter.

This is available on OpenRepos: https://openrepos.net/content/thp/rebounce
Also on TMO: https://talk.maemo.org/showthread.php?t=101160
Also on Twitter: https://twitter.com/thp4/status/1359758278620758019

Source code (copy'n'paste this into a file named "rebounce.c", then run it using your shell):

#if 0

gcc -Os -shared -fPIC -lSDL -o librebounce.so rebounce.c

LD_PRELOAD=$(pwd)/librebounce.so /opt/bounce/bin/bounce

exit 0

#endif


/**

* reBounce -- softfp-to-hardfp LD_PRELOAD hack for Bounce on N9

*

* Bounce was a really nice 2009 tech demo on the N900. This

* makes this tech demo work on an N9 by translating the calls

* that use floating point arguments to the hardfp ABI. It also

* fixes input using SDL1.2 to get sensor values from sensorfw.

*

* Known issues: Audio is muted on startup until mute is toggled.

*

* 2021-02-11 Thomas Perl <m@thp.io>

**/


#define _GNU_SOURCE


#include <stdio.h>

#include <dlfcn.h>

#include <pthread.h>

#include <SDL/SDL.h>


#define SFP __attribute__((pcs("aapcs")))


typedef unsigned int GLuint;


static void *

sensor_thread(void *user_data)

{

SDL_Init(SDL_INIT_JOYSTICK | SDL_INIT_VIDEO);


SDL_Joystick *accelerometer = SDL_JoystickOpen(0);


while (1) {

SDL_JoystickUpdate();


float x = 0.053888f * SDL_JoystickGetAxis(accelerometer, 0);

float y = 0.053888f * SDL_JoystickGetAxis(accelerometer, 1);

float z = 0.053888f * SDL_JoystickGetAxis(accelerometer, 2);


FILE *out = fopen("/dev/shm/bounce.sensor.tmp", "wb");

fprintf(out, "%f %f %f\n", -y, x, z);

fclose(out);

rename("/dev/shm/bounce.sensor.tmp", "/dev/shm/bounce.sensor");


SDL_Delay(10);

}


return NULL;

}


FILE *

fopen(const char *filename, const char *mode)

{

FILE *(*fopen_orig)(const char *, const char *) = dlsym(RTLD_NEXT, "fopen");

if (strcmp(filename, "/sys/class/i2c-adapter/i2c-3/3-001d/coord") == 0) {

static int sensor_inited = 0;

if (!sensor_inited) {

sensor_inited = 1;


pthread_t thread;

pthread_create(&thread, NULL, sensor_thread, NULL);

}


filename = "/dev/shm/bounce.sensor";

}


return fopen_orig(filename, mode);

}


#define f_f(name) float SFP name(float x) { return ((float (*)(float))dlsym(RTLD_NEXT, #name))(x); }

#define d_d(name) double SFP name(double x) { return ((double (*)(double))dlsym(RTLD_NEXT, #name))(x); }

#define f_ff(name) float SFP name(float x, float y) { return ((float (*)(float, float))dlsym(RTLD_NEXT, #name))(x, y); }

#define d_dd(name) double SFP name(double x, double y) { return ((double (*)(double, double))dlsym(RTLD_NEXT, #name))(x, y); }


f_f(sinhf) f_f(coshf) f_f(tanhf) f_f(asinf) f_f(acosf) f_f(atanf) f_f(sinf) f_f(cosf) f_f(tanf) f_f(expf) f_f(logf)

f_f(log10f) f_f(ceilf) f_f(floorf) d_d(log) d_d(sin) f_ff(atan2f) f_ff(fmodf) d_dd(atan2) d_dd(pow) d_dd(fmod)


double SFP

frexp(double value, int *exp)

{

return ((double (*)(double, int *))dlsym(RTLD_NEXT, "frexp"))(value, exp);

}


double SFP

ldexp(double x, int n)

{

return ((double (*)(double, int))dlsym(RTLD_NEXT, "ldexp"))(x, n);

}


double SFP

modf(double value, double *iptr)

{

return ((double (*)(double, double *))dlsym(RTLD_NEXT, "modf"))(value, iptr);

}


void SFP

glClearColor(float r, float g, float b, float a)

{

((void (*)(float, float, float, float))dlsym(RTLD_NEXT, "glClearColor"))(r, g, b, a);

}


void SFP

glUniform4f(GLuint location, float v0, float v1, float v2, float v3)

{

((void (*)(GLuint, float, float, float, float))dlsym(RTLD_NEXT, "glUniform4f"))(location, v0, v1, v2, v3);

}


void SFP

glUniform1f(GLuint location, float v0)

{

((void (*)(GLuint, float))dlsym(RTLD_NEXT, "glUniform1f"))(location, v0);

}


0 Add to favourites0 Bury

11 Feb 2021 7:41am GMT

04 Feb 2021

feedPlanet Maemo

Ubuntu Touch porting notes for the Redmi Note 7 Pro, part 2

This is the second part of my porting odyssey; for the first part, follow this link.

The good news is that I've done some progress; the bad news is that there are still plenty of issues, and solving them involves deep diving into nearly all components and technologies that make up the core of an Android device, so completing the porting is going to take quite some time. On the other hand, I'm learning a lot of new stuff, and I might be able to share it by writing some documentation. And, who knows, maybe work on some other device port.

Anyway, enough with the introduction! Let's see what progress I've been doing so far.

The new device tree

While asking for help to debug the audio issue I was facing (more about that later), I was also told that the lavender tree, which I was using as a reference, was obsolete. The new one was in gitlab, and was build with a totally different system, described here.

So, I picked the lavender tree and adapted it for violet: I changed the deviceinfo file to point to my kernel tree, use my kernel configuration, and use the same boot command line as before. By the way, here's my current "new" device tree. The build failed, with errors like:

In function 'memcpy',
    inlined from 'proc_ipc_auto_msgmni.part.1' at /home/mardy/projects/port/xiaomi-violet/bd/downloads/android_kernel_xiaomi_violet/ipc/ipc_sysctl.c:82:2:
/home/mardy/projects/port/xiaomi-violet/bd/downloads/android_kernel_xiaomi_violet/include/linux/string.h:340:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter
    __read_overflow2();
    ^

I then replayed the kernel build using the old system, and noticed that it was using clang as the compiler; so I changed the related flag in the deviceinfo file, and the build went past that point. It failed later, though:

/home/mardy/projects/port/xiaomi-violet/bd/downloads/android_kernel_xiaomi_violet/drivers/staging/qcacld-3.0/core/sap/src/sap_fsm.c:1498:26: error: cast to smaller integer type 'eSapStatus' from 'void *' [-Werror,-Wvoid-pointer-to-enum-cast]
                bss_complete->status = (eSapStatus) context;
                                       ^~~~~~~~~~~~~~~~~~~~

I ended up editing the Kbuild file in the module source directory, and removed the -Werror from there (as well in another place that failed a bit later). This made the build proceed until the end, where it failed because the device tree file was not found:

+ cp -v /home/mardy/projects/port/xiaomi-violet/bd/downloads/KERNEL_OBJ/arch/arm64/boot/dts/qcom/sm8150-mtp-overlay.dtbo /home/mardy/projects/port/xiaomi-violet/bd/tmp/partitions/dtbo.img
cp: cannot stat '/home/mardy/projects/port/xiaomi-violet/bd/downloads/KERNEL_OBJ/arch/arm64/boot/dts/qcom/sm8150-mtp-overlay.dtbo': No such file or directory

I quickly realized that this was due to an error of mine: the Xiaomi violet is a sm6150, not a sm8150 as mentioned in my deviceinfo file! But what overlay file should I use then, since there isn't a sm6150-mtp-overlay.dts in my source tree? Having a loot at other deviceinfo files, I saw that the deviceinfo_kernel_dtb_overlay line is not always there, so I tried commenting it out, and it helped.

The next step was getting flashable images, of course. While the device is not supported by the UBports system-image server, we can use use some scripts to create a fake OTA (Over The Air update) and generate flashable images starting from it. The steps can be read in the devel-flashable target of the .gitlab-ci.yml file (if this step is not present, we should try to find another device repository which has it. They are the following:

./build/prepare-fake-ota.sh out/device_violet.tar.xz ota
./build/system-image-from-ota.sh ota/ubuntu_command out

Once these commands have completed their execution, these commands will push the images to the device:

fastboot flash boot out/boot.img; fastboot flash system out/system.img

There's also fastboot flash recovery out/recovery.img, but I left it out since I was not interested in the recovery image at this stage. And unless you are ready to submit your port for inclusion into the "supported devices" list, I'd recommend not flashing the UT recovery image since TWRP is more powerful and will likely help you in recover your device from a broken image.

Kernel command line

It's important that the kernel command line contains the systempart parameter. In my case, the first boot failed because the command line was longer than 512 bytes, and this parameter was getting truncated. So one thing to be careful about is the length of the kernel command line.

This was fixed by removing some unnecessary kernel parameters from the deviceinfo file.

Missing thumbnails in the Gallery app, content-hub not working

Another issue I noticed is that photo thumbnails were all black in the Gallery app, and the content-hub also was not working. I noticed a relevant commit in the lavender kernel tree and found a launchpad bug which mentioned the issues I was seeing. In the Telegram channel I was told that patch is a forward port of a commit from kernel 3.4 that was present in all of the cores devices, and that it was indeed needed to have the ContentHub working.

The patch did not apply cleanly on top of my kernel tree, but luckily it was just an offset issue: adjusting the patch was easy, and indeed after applying it thumbnails started appearing in the Gallery and Imaginario could import photos again via the ContentHub.

Time and date

Time was always reset to a far away date after a reboot. This is a common issue on Qualcomm devices, and can be fixed by disabling the time service in the Android container.

https://github.com/Halium/android_device_halium_halium_arm64/pull/3

For some reason, at a certain point my override stopped working (or, more likely, the override never worked, but I happened to have fixed the issue directly modifying the vendor init file). I had to copy /android/vendor/etc/init/hw/init.qcom.rc into /usr/share/halium-overrides/, modify it and bind mount the modified file in order to get it working.

This actually seems to match my understanding of the Android init documentation, because according to the path priorities a configuration file stored under /system/ will never be able to override one stored under /vendor.

Fixing the audio configuration

Audio was not working at all. The sound indicator icon was not shown, only the raw (unstranslated) "indicator-sound" text was shown in the panel. pulseaudio was not running. Trying to run it manually (as the phablet user, since pulseaudio is run in the user session) led to this:

phablet@ubuntu-phablet:~$ pulseaudio -n -vvv -F /etc/pulse/touch-android9.pa
I: [pulseaudio] main.c: setrlimit(RLIMIT_NICE, (31, 31)) failed: Operation not permitted
I: [pulseaudio] main.c: setrlimit(RLIMIT_RTPRIO, (9, 9)) failed: Operation not permitted
...
D: [pulseaudio] cli-command.c: Parsing script '/etc/pulse/touch-android9.pa'
D: [pulseaudio] database-tdb.c: Opened TDB database '/home/phablet/.config/pulse/ubuntu-phablet-stream-volumes.tdb'
I: [pulseaudio] module-stream-restore.c: Successfully opened database file '/home/phablet/.config/pulse/ubuntu-phablet-stream-volumes'.
...
D: [pulseaudio] module.c: Checking for existence of '/usr/lib/pulse-8.0/modules/module-droid-card-28.so': success
I: [pulseaudio] module-droid-card.c: Create new droid-card
D: [pulseaudio] droid-util.c: No configuration provided for opening module with id primary
I: [pulseaudio] config-parser-xml.c: Failed to open file (/odm/etc/audio_policy_configuration.xml): No such file or directory
D: [pulseaudio] droid-config.c: Failed to parse configuration from /odm/etc/audio_policy_configuration.xml
D: [pulseaudio] config-parser-xml.c: Read /vendor/etc/audio/audio_policy_configuration.xml ...
D: [pulseaudio] config-parser-xml.c: New module: "primary"
W: [pulseaudio] config-parser-xml.c: [/vendor/etc/audio/audio_policy_configuration.xml:78] Could not find element attribute "samplingRates"
E: [pulseaudio] config-parser-xml.c: [/vendor/etc/audio/audio_policy_configuration.xml:78] Failed to parse element <profile>
E: [pulseaudio] config-parser-xml.c: parsing aborted at line 78
D: [pulseaudio] droid-config.c: Failed to parse configuration from /vendor/etc/audio/audio_policy_configuration.xml
D: [pulseaudio] config-parser-xml.c: Read /vendor/etc/audio_policy_configuration.xml ...
D: [pulseaudio] config-parser-xml.c: New module: "primary"
W: [pulseaudio] config-parser-xml.c: [/vendor/etc/audio_policy_configuration.xml:78] Could not find element attribute "samplingRates"
E: [pulseaudio] config-parser-xml.c: [/vendor/etc/audio_policy_configuration.xml:78] Failed to parse element <profile>
E: [pulseaudio] config-parser-xml.c: parsing aborted at line 78
D: [pulseaudio] droid-config.c: Failed to parse configuration from /vendor/etc/audio_policy_configuration.xml
I: [pulseaudio] config-parser-legacy.c: Failed to open config file (/vendor/etc/audio_policy.conf): No such file or directory
D: [pulseaudio] droid-config.c: Failed to parse configuration from /vendor/etc/audio_policy.conf
I: [pulseaudio] config-parser-xml.c: Failed to open file (/system/etc/audio_policy_configuration.xml): No such file or directory
D: [pulseaudio] droid-config.c: Failed to parse configuration from /system/etc/audio_policy_configuration.xml
I: [pulseaudio] config-parser-legacy.c: Failed to open config file (/system/etc/audio_policy.conf): No such file or directory
D: [pulseaudio] droid-config.c: Failed to parse configuration from /system/etc/audio_policy.conf
E: [pulseaudio] droid-config.c: Failed to parse any configuration.
...

Indeed, line 78 of /vendor/etc/audio_policy_configuration.xml had an error, where a property was spelt like simplingRate instead of samplingRate. However, the "vendor" partition is read-only, so I couldn't change that file directly. Another option could have been creating a fixed copy of the file and place it with /system/etc/audio_policy_configuration.xml, but the "system" is also read-only (there are ways to modify these partitions, of course, but I couldn't find a clean way to do it from the device tree scripts. So I went for the bind-mount approach: I would ship the fixed file in some other directory of the file-system, and then modify the /etc/init/mount-android.conf file (this is the job that upstart executes before starting the Android LXC container) to bind-mount the file onto /vendor/etc/audio_policy_configuration.xml.

This worked, but my joy was short-lived: audio was coming up only once every 5 boots or so. I will not list here all the things I tried, as they were plenty of them; and more than once I went to sleep happy and convinced of having fixed the issue for good, until the next day the device booted without audio. It was clearly a timing issue occurring in the early boot, because one thing I clearly noticed very early on is that in those cases when the audio was booting, the following lines appeared in the kernel log:

[    7.130057] wcd937x_codec wcd937x-codec: msm_cdc_dt_parse_vreg_info: cdc-vdd-ldo-rxtx: vol=[1800000 1800000]uV, curr=[25000]uA, ond 0
[    7.130068] wcd937x_codec wcd937x-codec: msm_cdc_dt_parse_vreg_info: cdc-vddpx-1: vol=[1800000 1800000]uV, curr=[10000]uA, ond 0
[    7.130076] wcd937x_codec wcd937x-codec: msm_cdc_dt_parse_vreg_info: cdc-vdd-mic-bias: vol=[3296000 3296000]uV, curr=[25000]uA, ond 0
[    7.130084] wcd937x_codec wcd937x-codec: msm_cdc_dt_parse_vreg_info: cdc-vdd-buck: vol=[1800000 1800000]uV, curr=[650000]uA, ond 1
[    7.137759] wcd937x_codec wcd937x-codec: bound wcd937x-slave.1170224 (ops cleanup_module [wcd937x_slave_dlkm])
[    7.138065] wcd937x_codec wcd937x-codec: bound wcd937x-slave.1170223 (ops cleanup_module [wcd937x_slave_dlkm])

I started adding some printk to the kernel driver, and modified it slightly to register itself with the module_driver() macro instead of the simpler, but logless, module_platform_driver(). This showed that the driver was always loaded at about 7 seconds, but the platform_driver_register() method only called the driver's bind method (wcd937x_bind()) in those boots where audio was working.

After more debugging into platform_driver_register(), I stumbled upon the platform_match() function, added some debugging message in there to print the device name and the driver name, and observed how in those boots where audio was failing this function was called to find a driver for the wcd937x_codec device before the wcd937x_codec driver (provided by the wcd937x_dlmk module) was available. So, I tried adding wcd937x_dlmk to /etc/modules-load.d/modules.conf and this caused the driver to be loaded at about 3 seconds, and apparently fixed the audio issue. At least, till the time of writing this, I never had my phone boot without audio anymore.

Not all is fine with the audio, unfortunately: the mic records a background noise along with the actual sounds, and the recording volume is quite low. This also affects the call quality. On the other hand, the noise disappears when recording happens via the earphones. But I've yet to investigate this; I hope to give you some updates in part three.

0 Add to favourites0 Bury

04 Feb 2021 8:34pm GMT

17 Jan 2021

feedPlanet Maemo

Ubuntu Touch porting notes for the Redmi Note 7 Pro

In case you have a sense of deja-vu when reading this post, it's because indeed this is not the first time I try porting a device to Ubuntu Touch. The previous attempt, however, was with another phone model (and manufacturer), and did not have a happy ending. This time it went better, although the real ending is still far away; but at least I have something to celebrate.

The phone

I made myself a Christmas present and bought a Xiaomi Redmi Note 7 Pro, a dual-SIM phone from 2019 with 6GB of RAM and 128GB of flash storage. To be totally honest, I bought it by mistake: the phone I really wanted to buy is the Redmi Note 7 (without the "Pro"), because it's a modern phone that is working reasonable well with Ubuntu Touch. The online shop where I bought it from let me choose some options, including the RAM size, so I chose the maximum available (6GB) without being aware that this would mean that I would be buying the "Pro" device - the shop did not alter the item name, so I couldn't really know. Unfortunately, the two versions are two rather different beasts, powered by different SoC; both are produced by Qualcomm, so they are not that different, but it's enough to make the installation of Ubuntu Touch impossible.

But between the choice of retuning the phone to the shop and begin a new porting adventure, I stood firm and went for the latter. Hopefully I won't regret it (if everything goes bad, I can still use it with LineageOS, which runs perfectly on it).

Moreover, there already exist a port of Ubuntu Touch for this phone, which actually works reasonably well (I tried it briefly, and many things were working), but the author claims to be a novice and indeed did not follow the best git practices when working on the source code, so it's hard to understand what was changed and why. But if you are looking for a quick way to get Ubuntu Touch working on this phone, you are welcome to have a look at this Telegram channel.

What follows are the raw notes of my attempts. They are here so that search engines can index the error messages and the various logs, and hopefully help someone hitting similar errors on other devices to find his way out.

Getting Halium and the device source code

Installed the dependencies like in the first step. repo was not found in the Ubuntu 20.04 archives, but I had it installed anyway due to my work on a Yocto device.

Since my device has Android 9 on it, I went for Halium 9:

repo init -u git://github.com/Halium/android.git -b halium-9.0 --depth=1
repo sync -c -j 16

The official LineageOS repository for the Xiaomi Redmi Note 7 Pro is android_device_xiaomi_violet, but the page with the official build has been taken down (some DMCA violation, if you believe the forums) and no development has been happening since last year. A more active forum thread uses another repository which seems to be receiving more frequent updates, so I chose to base my port on that.

I actually tested that LineageOS image on my phone, and verified that all the hardware was working properly.

Initially, I created forks of the relevant repository under my own gitlab account, but then I though (especially looking at the Note 7 port) that creating a group just for this port would make people's life easier, because they wouldn't need to navigate through my 1000 personal projects to find what is relevant for this port. So, I created these forks:

Next, created the halium/devices/manifests/xiaomi_violet.xml file with this content:

<?xml version="1.0" encoding="UTF-8"?>
<manifest>
    <!-- Remotes -->
    <remote name="ubuntu-touch-xiaomi-violet"
            fetch="https://gitlab.com/ubuntu-touch-xiaomi-violet"
            revision="halium-9.0"/>

    <!-- Device Tree -->
    <project path="device/xiaomi/violet"
             name="android_device_xiaomi_violet"
             remote="ubuntu-touch-xiaomi-violet" />

    <!-- Kernel -->
    <project path="kernel/xiaomi/violet"
             name="android_kernel_xiaomi_violet"
             remote="ubuntu-touch-xiaomi-violet" />

    <!-- Proprietary/Vendor blobs -->
    <project path="vendor/xiaomi/violet"
             name="android_vendor_xiaomi_violet"
             remote="ubuntu-touch-xiaomi-violet" />
</manifest>

Fetching all the sources mentioned in the manifest:

./halium/devices/setup violet

Full output:

*****************************************
I: Configuring for device xiaomi_violet
*****************************************
Fetching projects: 100% (393/393), done.
hardware/qcom/audio-caf/apq8084: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/msm8916: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/msm8952: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/msm8960: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/msm8974: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/msm8994: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/msm8996: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/msm8998: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/sdm845: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio-caf/sm8150: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/audio/default: Shared project LineageOS/android_hardware_qcom_audio found, disabling pruning.
hardware/qcom/display: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/apq8084: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/msm8916: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/msm8952: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/msm8960: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/msm8974: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/msm8994: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/msm8996: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/msm8998: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/sdm845: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/display-caf/sm8150: Shared project LineageOS/android_hardware_qcom_display found, disabling pruning.
hardware/qcom/media: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/apq8084: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/msm8916: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/msm8952: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/msm8960: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/msm8974: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/msm8994: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/msm8996: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/msm8998: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/sdm845: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/qcom/media-caf/sm8150: Shared project LineageOS/android_hardware_qcom_media found, disabling pruning.
hardware/ril: Shared project LineageOS/android_hardware_ril found, disabling pruning.
hardware/ril-caf: Shared project LineageOS/android_hardware_ril found, disabling pruning.
hardware/qcom/wlan: Shared project LineageOS/android_hardware_qcom_wlan found, disabling pruning.
hardware/qcom/wlan-caf: Shared project LineageOS/android_hardware_qcom_wlan found, disabling pruning.
hardware/qcom/bt: Shared project LineageOS/android_hardware_qcom_bt found, disabling pruning.
hardware/qcom/bt-caf: Shared project LineageOS/android_hardware_qcom_bt found, disabling pruning.
Updating files: 100% (67031/67031), done.nel/testsUpdating files:  36% (24663/67031)
lineage/scripts/: discarding 1 commits
Updating files: 100% (1406/1406), done.ineageOS/android_vendor_qcom_opensource_thermal-engineUpdating files:  47% (666/1406)
Checking out projects: 100% (392/392), done.
*******************************************
I: Refreshing device vendor repository: device/xiaomi/violet
I: Processing proprietary blob file: device/xiaomi/violet/./proprietary-files.txt
I: Processing fstab file: device/xiaomi/violet/./rootdir/etc/fstab.qcom
I: Removing components relying on SettingsLib from: device/xiaomi/violet
*******************************************

Starting the build

Setting up the environment:

$ source build/envsetup.sh
including device/xiaomi/violet/vendorsetup.sh
including vendor/lineage/vendorsetup.sh
$

Running the breakfast command:

$ breakfast violet
including vendor/lineage/vendorsetup.sh
Trying dependencies-only mode on a non-existing device tree?

============================================
PLATFORM_VERSION_CODENAME=REL
PLATFORM_VERSION=9
LINEAGE_VERSION=16.0-20210109-UNOFFICIAL-violet
TARGET_PRODUCT=lineage_violet
TARGET_BUILD_VARIANT=userdebug
TARGET_BUILD_TYPE=release
TARGET_ARCH=arm64
TARGET_ARCH_VARIANT=armv8-a
TARGET_CPU_VARIANT=kryo300
TARGET_2ND_ARCH=arm
TARGET_2ND_ARCH_VARIANT=armv8-a
TARGET_2ND_CPU_VARIANT=cortex-a75
HOST_ARCH=x86_64
HOST_2ND_ARCH=x86
HOST_OS=linux
HOST_OS_EXTRA=Linux-5.4.0-58-generic-x86_64-Ubuntu-20.04.1-LTS
HOST_CROSS_OS=windows
HOST_CROSS_ARCH=x86
HOST_CROSS_2ND_ARCH=x86_64
HOST_BUILD_TYPE=release
BUILD_ID=PQ3A.190801.002
OUT_DIR=/home/mardy/projects/port/halium/out
PRODUCT_SOONG_NAMESPACES= hardware/qcom/audio-caf/sm8150 hardware/qcom/display-caf/sm8150 hardware/qcom/media-caf/sm8150
============================================

Building the kernel

The configuration needs to be adapted for Halium. Locating the kernel config:

$ grep "TARGET_KERNEL_CONFIG" device/xiaomi/violet/BoardConfig.mk 
TARGET_KERNEL_CONFIG := vendor/violet-perf_defconfig
$

Getting the Mer checker tool and running it:

git clone https://github.com/mer-hybris/mer-kernel-check
cd mer-kernel-check
./mer_verify_kernel_config ../kernel/xiaomi/violet/arch/arm64/configs/vendor/violet-perf_defconfig

Prints lots of warnings, starting with:

WARNING: kernel version missing, some reports maybe misleading

To help the tool, we need to let him know the kernel version. It can be seen at the beginning of the kernel Makefile, located in ../kernel/xiaomi/violet/Makefile; in my case it was

VERSION = 4
PATCHLEVEL = 14
SUBLEVEL = 83
EXTRAVERSION =

So I edited the configuration file ../kernel/xiaomi/violet/arch/arm64/configs/vendor/violet-perf_defconfig and added this line at the beginning:

# Version 4.14.83

then ran the checker tool again. This time the output was a long list of kernel options that needed to be fixed, but as I went asking for some explanation in the Telegram channel for Ubuntu Touch porters, I was told that I could/should skip this test and instead use the check-kernel-config tool from Halium. I downloaded it, made it executable (chmod +x check-kernel-config) and ran it:

./check-kernel-config kernel/xiaomi/violet/arch/arm64/configs/vendor/violet-perf_defconfig
[...lots of red and green lines...]
Config file checked, found 288 errors that I did not fix.

I ran it again with the -w option, and it reportedly fixed 287 errors. Weird, does that mean that an error was still left? I ran the tool again (without -w), and it reported 2 errors. Ran it once more in write mode, and if fixed them. So, one might need to run it twice.

Next, added the line

BOARD_KERNEL_CMDLINE += console=tty0

in device/xiaomi/violet/BoardConfig.mk. One more thing that needs to be done before starting the build is fixing the mount points, so I opened device/xiaomi/violet/rootdir/etc/fstab.qcom and changed the type of the userdata partition from f2fs to ext4. There were no lines with the context option, so that was the only change I needed to do.

At this point, while browsing throught the documentation, I found a link to this page which contains some notes on Halium 9 porting, which are substantially different from the official porting guide in halium.org (which I was already told in Telegram to be obsolete in several points).

So, following the instructions from this new link I ran

hybris-patches/apply-patches.sh --mb

which completed successfully.

Then, continuing following the points from this page, I edited device/xiaomi/violet/lineage_violet.mk, commented out the lines

$(call inherit-product, $(SRC_TARGET_DIR)/product/full_base_telephony.mk)
# ...
$(call inherit-product, vendor/lineage/config/common_full_phone.mk)

and replaced the first one with a similar line pointing to halium.mk. For the record, a find revealed that the SRC_TARGET_DIR variable in my case was build/make/target/. It contained also the halium.mk file, which was created by the hybris patches before. As for removing the Java dependencies, I cound't find any modules similar to those listed in this commit in any of the makefiles in my source tree, so I just started the build:

source build/envsetup.sh && breakfast violet
make halium-boot

This failed the first time with the compiler being killed (out of memory, most likely), but it succeeded on the second run. There was a suspicious warning, though:

drivers/input/touchscreen/Kconfig:1290:warning: multi-line strings not supported

And indeed my kernel/xiaomi/violet/drivers/input/touchscreen/Kconfig had this line:

source "drivers/input/touchscreen/ft8719_touch_f7b/Kconfig

(notice the unterminated string quote). I don't know if this had any impact on the build, but just to be on the safe side I added the missing quote and rebuilt.

Building the system image

Running

make systemimage

failed pretty soon:

ninja: error: '/home/mardy/projects/port/halium/out/soong/host/linux-x86/framework/turbine.jar', needed by '/home/mardy/projects/port/halium/out/soong/.intermediates/libcore/core-oj/android_common/turbine/core-oj.jar', missing and no known rule to make it

The error is due to all the Java-related stuff that I should have disabled but couldn't find. So, I tried to have a look at the changes made on another xiaomi device (lavender, the Redmi note 7, which might not be that different, I thought) and started editing device/xiaomi/violet/device.mk and removing a couple of Android packages. Eventually the build proceeded, just to stop at a python error:

  File "build/make/tools/check_radio_versions.py", line 56
    print "*** Error opening \"%s.sha1\"; can't verify %s" % (fn, key)
          ^
SyntaxError: invalid syntax

Yes, it's the python3 vs python2 issue, since in my system python is python version 3. In order to fix it, I created a virtual environment:

virtualenv --python 2.7 ../python27  # adjust the path to your prefs
source ../python27/bin/activate

Remember that the second line must be run every time you'll need to setup the Halium build environment (that is, every time you run breakfast).

The build then proceeded for several minutes, until it failed due to some unresolved symbols:

prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/aarch64-linux-android/bin/ld.gold: error: /home/mardy/projects/port/halium/out/target/product/violet/obj/STATIC_LIBRARIES/lib_driver_cmd_qcwcn_intermediates/lib_driver_cmd_qcwcn.a: member at 7694 is not an ELF object
external/wpa_supplicant_8/hostapd/src/drivers/driver_nl80211.c:7936: error: undefined reference to 'wpa_driver_set_p2p_ps'
/home/mardy/projects/port/halium/out/target/product/violet/obj/EXECUTABLES/hostapd_intermediates/src/drivers/driver_nl80211.o:driver_nl80211.c:wpa_driver_nl80211_ops: error: undefined reference to 'wpa_driver_set_ap_wps_p2p_ie'
/home/mardy/projects/port/halium/out/target/product/violet/obj/EXECUTABLES/hostapd_intermediates/src/drivers/driver_nl80211.o:driver_nl80211.c:wpa_driver_nl80211_ops: error: undefined reference to 'wpa_driver_get_p2p_noa'
/home/mardy/projects/port/halium/out/target/product/violet/obj/EXECUTABLES/hostapd_intermediates/src/drivers/driver_nl80211.o:driver_nl80211.c:wpa_driver_nl80211_ops: error: undefined reference to 'wpa_driver_set_p2p_noa'
/home/mardy/projects/port/halium/out/target/product/violet/obj/EXECUTABLES/hostapd_intermediates/src/drivers/driver_nl80211.o:driver_nl80211.c:wpa_driver_nl80211_ops: error: undefined reference to 'wpa_driver_nl80211_driver_cmd'

As people told me in the Telegram channel, this driver is not used since "our wpa_supplicant talks to kernel directly". OK, so I simply disabled the driver, copying again from the lavender Halium changes: commented out the BOARD_WLAN_DEVICE := qcwcn line (and all lines referring this variable) from device/xiaomi/violet/BoardConfig.mk and ran make systemimage again. This time, surprisingly, it all worked.

Try it out

I first tried to flash the kernel only. I rebooted into fastboot, and on my PC ran this:

cout
fastboot flash boot halium-boot.img

I then rebooted my phone, but after showing the boot logo for a few seconds it would jump to the fastboot screen. Indeed, flashing the previous kernel would restore the normal boot, so there had to be something wrong with my own kernel.

While looking at what the problem could be, I noticed that in the lavender port the author did not modify the lineageos kernel config file in place, but instead created a new one and changed the BoardConfig.mk file to point to his new copy. Since it sounded like a good idea, I did the same and created kernel/xiaomi/violet/arch/arm64/configs/vendor/violet_halium_defconfig for the Halium changes. And then the line in the board config file became

TARGET_KERNEL_CONFIG := vendor/violet_halium_defconfig

I then continued investigating the boot issue, and I was told that it might have been due to an Android option, skip_initramfs, which is set by the bootloader and causes our Halium boot to fail. The fix is to just disable this option in the kernel, by editing init/initramfs.c and change the skip_initramfs_param function to always set the do_skip_initramfs variable to 0, rather than to 1. After doing this, the boot proceeded to show the Ubuntu splash screen with the five dots being lit, but it didn't proceed from there.

Setting up the udev rules

Even in this state, the device was detected by my host PC and these lines appeared in the system log:

kernel: usb 1-3: new high-speed USB device number 26 using xhci_hcd
kernel: usb 1-3: New USB device found, idVendor=0fce, idProduct=7169, bcdDevice= 4.14
kernel: usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
kernel: usb 1-3: Product: Unknown
kernel: usb 1-3: Manufacturer: GNU/Linux Device
kernel: usb 1-3: SerialNumber: GNU/Linux Device on usb0 10.15.19.82

Indeed, I guess the reason I could do this without even flashing my systemimage is because I had first flashed another UT system image from another porter. I'm not sure if I'd had the same results with my own image. Anyway, the USB networking was there, so I connected and ran the following commands to generate the udev rules:

ssh phablet@10.15.19.82
# used 0000 as password
sudo -i
# same password again
cd /home/phablet
DEVICE=violet
cat /var/lib/lxc/android/rootfs/ueventd*.rc /vendor/ueventd*.rc \
        | grep ^/dev \
        | sed -e 's/^\/dev\///' \
        | awk '{printf "ACTION==\"add\", KERNEL==\"%s\", OWNER=\"%s\", GROUP=\"%s\", MODE=\"%s\"\n",$1,$3,$4,$2}' \
        | sed -e 's/\r//' \
        >70-$DEVICE.rules

I then copied (with scp) this file to my host PC, and I moved it to device/xiaomi/violet/ubuntu/70-violet.rules (I had to create the ubuntu directory first). Then I edited the device/xiaomi/violet/device.mk file and added these lines at the end:

### Ubuntu Touch ###
PRODUCT_COPY_FILES += \
    $(LOCAL_PATH)/ubuntu/70-violet.rules:system/halium/lib/udev/rules.d/70-android.rules
### End Ubuntu Touch ###

It was now time to try my own system image. I rebooted my device into TWRP, I cloned the halium-install repository into my halium build dir, downloaded a rootfs for Halium9, and ran

./halium-install/halium-install -p ut \
    ~/Downloads/ubuntu-touch-android9-arm64.tar.gz \
    out/target/product/violet/system.img

The first time this failed because the simg2img tool was not installed. The second time it proceeded to create the image, asked me for a password for the phablet user (gave "0000") and pushed the image onto the device, into the /data/ partition. I then rebooted.

My first Ubuntu Touch image

Upon reboot, the usual splash screen appeared, followed by several seconds of black screen. It definitely didn't look right, but at least it proved that something had been flashed. After some more seconds, to my big surprise, the Ubuntu boot screen appeared, just with a smaller logo than how it used to be before, which also confimed that my system image was being used - in fact, I did not adjust the GRID_UNIT_PX variable before. Since this was an easy fix, I chose to focus on that, rather than fix the boot issues (indeed, my device did not move on from the Ubuntu boot screen). SSH was working.

I took the scaling.conf file from the lavender changes, put it in device/xiaomi/violet/ubuntu/ and added this line in the PRODUCT_COPY_FILES in device.mk:

$(LOCAL_PATH)/ubuntu/scaling.conf:system/halium/etc/ubuntu-touch-session.d/android.conf

I initially used system/ubuntu/etc/... as the target destination for the config file, like in the lavender commit, but this didn't work out for me. Then I changed the path to start with system/halium/, like it's mentioned in the ubports documentation, but it apparently had no effect either.

After a couple of days spent trying to understand why my files were not appearing under /etc, it turned out that the device was not using my system image at all: with halium-install I had my image installed in /userdata/system.img, while the correct path for Halium 9 devices is /userdata/android-rootfs.img. I was told that the option -s of halium-install would do the trick. Instead of re-running the script, I took the shortcut of renaming /userdata/system.img to /userdata/android-rootfs.img and after rebooting I could see that the Ubuntu logo was displayed at the correct size. And indeed my system image was being used.

So I started to debug why unity8 didn't start. The logs terminated with this line:

terminate called after throwing an instance of 'std::runtime_error'
  what():  org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
initctl: Event failed

It means, that unity8 did not handle correctly the situation where another D-Bus service crashed while handing a call from unity8. The problem now was how to figure out which service it was. I ran dbus-monitor and restarted unity8 (initctl start unity8), then examined the dbus logs; I saw the point where unity8 got disconnected from the bus, but before that point I didn't find any failed D-Bus calls. So it had to be the system bus. I did exactly the same steps, just this time after running dbus-monitor --system as root, I found the place where unity8 got disconnected, and found this D-bus error shortly before that:

method call time=1605676421.968667 sender=:1.306 -> destination=com.ubuntu.biometryd.Service serial=3 path=/default_device; interface=com.ubuntu.biometryd.Device; member=Identifier
method return time=1605676421.970222 sender=:1.283 -> destination=:1.306 serial=4 reply_serial=3
   object path "/default_device/identifier"
...
method call time=1605676421.974218 sender=:1.283 -> destination=org.freedesktop.DBus serial=6 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=GetConnectionAppArmorSecurityContext
   string ":1.306"
error time=1605676421.974278 sender=org.freedesktop.DBus -> destination=:1.283 error_name=org.freedesktop.DBus.Error.AppArmorSecurityContextUnknown reply_serial=6
   string "Could not determine security context for ':1.306'"
...
signal time=1605676421.989074 sender=org.freedesktop.DBus -> destination=:1.283 serial=6 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=NameLost
   string "com.ubuntu.biometryd.Service"
...
error time=1605676421.989302 sender=org.freedesktop.DBus -> destination=:1.306 error_name=org.freedesktop.DBus.Error.NoReply reply_serial=4
   string "Message recipient disconnected from message bus without replying"

So, it looks like unity8 (:1.306) made a request to biometryd, who asked the D-Bus daemon what was the AppArmor label of the caller, but since I didn't backport the D-Bus mediation patches for AppArmor, the label could not be resolved. biometryd decided to crash instead of properly handling the error, and so did unity8.

So I backported the AppArmor patches. I took them from the Canonical kernel repo, but they did not apply cleanly because they are meant to be applied on top of a pristine 4.14 branch, whereas the Android kernel had already some AppArmor security fixes backported from later releases. So I set and quickly inspected the contents of each patch, and found out that a couple of them had already been applied, and the last patch had to be applied as the first (yeah, it's hard to explain, but it all depends on other patches that Android has backported from newer kernels). Anyway, after rebuilding the kernel and reflashing it, my phone could finally boot to unity8!

Conclusion (of the first episode)

The actual porting, as I've been told, starts here. What has been documented here are only the very first steps of the bring-up; what awaits me now is to make all the hardware subsystems work properly, and this, according to people more experienced in porting, is the harder part.

So far, very few things work, to the point that it's faster to me to list the things that do work; it's safe to assume that all what is not listed here is not working:

  • Mir, with graphics and input: Unity8 starts and is usable
  • Camera: can take photos; video recording start but an error appears when the stop button is pressed
  • Flashlight
  • Fingerprint reader: surprisingly, this worked out of the box
  • Screen brightness (though it can be changed only manually)

For all the rest, please keep an eye on this blog: I'll write, when I make some progress!

0 Add to favourites0 Bury

17 Jan 2021 7:15pm GMT

17 Dec 2020

feedPlanet Maemo

Avast, Qt6 announcing new QPromise and QFuture APIs

Qt published its New_Features in Qt 6.0.

Some noteworthy items in their list:

I like to think I had my pirate-hook in it at least a little bit with QTBUG-61928.

I need to print this out and put it above my bed:
Thiago Macieira added a comment - 13 Jul '17 03:51
You're right
Philip Van Hoof added a comment - 13 Jul '17 07:32
Damn, and I was worried the entire morning that I had been ranting again.
Thiago Macieira added a comment - 13 Jul '17 16:06
oh, you were ranting. Doesn't mean you're wrong.
Thanks for prioritizing this Thiago.

0 Add to favourites0 Bury

17 Dec 2020 9:29pm GMT

06 Dec 2020

feedPlanet Maemo

How to generate random points on a sphere

This question often pops up, when you need a random direction vector to place things in 3D or you want to do a particle simulation.

We recall that a 3D unit-sphere (and hence a direction) is parametrized only by two variables; elevation \theta \in [0; \pi] and azimuth \varphi \in [0; 2\,\pi] which can be converted to Cartesian coordinates as

\begin{aligned} x &= \sin\theta \, \cos\varphi \\ y &= \sin\theta \, \sin\varphi \\ z &= \cos\theta \end{aligned}

If one takes the easy way and uniformly samples this parametrization in numpy like

phi = np.random.rand() * 2 * np.pi
theta = np.random.rand() * np.pi

One (i.e. you as you are reading this) ends with something like this:

While the 2D surface of polar coordinates uniformly sampled (left), we observe a bias of sampling density towards the poles when projecting to the Cartesian coordinates (right).
The issue is that the cos mapping of the elevation has an uneven step size in Cartesian space, as you can easily verify: cos^{'}(x) = sin(x).

The solution is to simply sample the elevation in the Cartesian space instead of the spherical space - i.e. sampling z \in [-1; 1]. From that we can get back to our elevation as \theta = \arccos z:

z = 1 - np.random.rand() * 2 # convert rand() range 0..1 to -1..1
theta = np.arccos(z)

As desired, this compensates the spherical coordinates such that we end up with uniform sampling in the Cartesian space:

Custom opening angle

If you want to further restrict the opening angle instead of sampling the full sphere you can also easily extend the above. Here, you must re-map the cos values from [1; -1] to [0; 2] as

cart_range = -np.cos(angle) + 1 # maximal range in cartesian coords
z = 1 - np.random.rand() * cart_range
theta = np.arccos(z)

Optimized computation

If you do not actually need the parameters \theta, \varphi, you can spare some trigonometric functions by using \sin \theta = \sqrt { 1 - z^2} as

\begin{aligned} x &= \sqrt { 1 - z^2} \, \cos\varphi \\ y &= \sqrt { 1 - z^2} \, \sin\varphi \end{aligned} 0 Add to favourites0 Bury

06 Dec 2020 1:12am GMT

18 Nov 2020

feedPlanet Maemo

Self-built NAS for Nextcloud hosting

With Google cutting its unlimited storage and ending the Play Music service, I decided to use my own Nextcloud more seriously.
In part because Google forced all its competitors out of the market, but mostly because I want to be independent of any cloudy services.

The main drawback of my existing Nextcloud setup, that I have written about here, was missing redundancy; the nice thing about putting your stuff in the cloud is that you do not notice if one of the storage devices fails - Google will take care of providing you with a backup copy of your data.

Unfortunately, the Intel NUC based build I used, while offering great power efficiency did not support adding a second HDD to create a fail-safe RAID1 setup. Therefore I had to upgrade.

As I still wanted to keep things power-efficient in a small form-factor, my choice fell on the ASRock DeskMini series. Here, I went with the AMD Variant (A300) in order to avoid paying the toll of spectre mitigations with Intel (resulting in just 80% of baseline performance).

Size comparison between the A300 and a Gigabyte BRIX

The photos above show you the size difference, which is considerable - yet necessary to cram two 2.5″ SATA drives next to each other.
Here, keep in mind that while the NUC devices have their CPU soldered on, we are getting the standardized STX form-factor with the A300, which means you can replace and upgrade the mainboard and the CPU as you wish, while with a NUC you are basically stuck with what you bought initially.

The full config of the build is as follows and totals at about 270€

Note, that I deliberately chose SSDs from different vendors to reduce the risk of simultaneous failure.
Also, while the 3000G is not the fastest AMD CPU, it is sufficient to host nextcloud and is still a nice upgrade from the Intel Celeron I used previously.
Furthermore, its 35W TDP nicely fits the constrained cooling options. Note, that you can limit for Ryzen 3/5 CPUs to 35W in the BIOS as well, so there is not need to get their GE variants.
However, for a private server you probably do not need that CPU power anyway, so just go with the Athlon 3000G for half the price.

Unfortunately, the A300 system is not designed for passive cooling and comes with a quite annoying CPU fan. To me the fan coming with the Athlon 3000G was less annoying, so I used that instead.
Anyway, you should set the fan RPM to 0% below 50° C in the BIOS, which results in 800 RPM and is unhearable while keeping the CPU reasonably cool.

Power Consumption

As the machine will run 24/7, power consumption is an important factor. As a rule of thumb you can calculate with 1€ for 1 Watt drained non-stop for 1 Year.

The 35W TDP gives us a upper limit of what the system will consume on persistent load - however the more interesting measure is the idle consuption as thats the state the system will be most of the time.

As I already tried some builds with different architectures, we have some interesting values to compare to, putting the A300 build in perspective

Build CPU Idle Load
Odroid U3 Exynos4412 3.7 W 9 W
Gigabyte BRIX Intel N3350 4.5 W 9.6 W
A300 Athlon 3000G 7.3 W 33.6 W

While you can obviously push the system towards 35W by with multiple simultaneous users, the 7.3 W idle consumption is quite nice.
Keep in mind, that the A300 was measured with two SATA drives operating as RAID1. If you only use one you can subtract 1W - at which point it is only 1.5 W away from the considerably weaker NUC system.

You might now wonder, whether the load or the idle measure is closer to the typical consumption. For this I measured the consumption for one day, which totalled at 0.18 kWh - or 7.5 Watts.

Power optimizations

To reach that 7.3 W idle, you need to tune some settings though. The most important one and luckily the easiest to fix is using a recent kernel.
If you are on Ubuntu 18.04, update to 20.04 or install the hwe kernel (5.4.0) - it saves you 4 Watts (11.3 to 7.3).

For saving about 0.5 watts, you can downgrade the network interface from 1Gbit to 100Mbit by executing

ethtool -s enp2s0 speed 100 duplex full autoneg on

Additionally, you can use Intels powertop to tune your system settings for power saving as

powertop --auto-tune

0 Add to favourites0 Bury

18 Nov 2020 8:16pm GMT

12 Nov 2020

feedPlanet Maemo

Imaginario 0.10

Today I released Imaginario 0.10. No bigger changes there, but two important bugfixes.

0 Add to favourites0 Bury

12 Nov 2020 4:12pm GMT