At Mercari, we’ve recently developed a system to track our application’s performance using Firebase Performance Monitoring. I’m going to introduce our approach in this post.
As iOS engineers, we care about performance metrics such as launch time and frame rate, because differences in performance can affect business metrics such as DAU and retention rate. Even so, it’s difficult to notice performance degradation without a system to monitor it.
You might already be familiar with tools like Time Profiler for observing how your application performs. However, unless you have a team that can consistently commit resources to improving performance, you may run into trouble. If you’re developing new product features daily, performance work tends to take a back seat. Even if metrics improve once, a regression can creep in before anyone notices.
We don’t yet have engineers dedicated to mobile infrastructure. Even so, we still want to make sure we notice any performance degradation. Our goal here is to develop a system for monitoring our iOS application’s performance that takes little effort to implement and maintain.
One option is to build the foundation on your own. On iOS, one possible approach would be to write performance tests with measure(_:) in XCTest.framework, run them on CI, analyze the results, and repeat that cycle continuously to make sure the target operations perform as expected. The Slack Android team provides an example of such an approach in this presentation.
They developed a performance testing pipeline that runs tests when PRs are opened, evaluates outcomes statistically, and sends engineers feedback before merging. The great part about this approach is that you can notice the degradation before releasing your app.
On the other hand, as stated in the presentation, building such a pipeline and keeping it stable requires committing a large amount of resources. The testing codebase and the whole CI system must be maintained over the long run too. That becomes a problem if your team cannot spare the time for maintenance; I’d normally say small teams like ours cannot afford to take this approach.
Another option is to use prebuilt third-party solutions such as Firebase Performance Monitoring (hereafter FPM), Nimble, etc. The best part is that you don’t have to build or maintain a system yourself. For FPM, you just need to add the Performance Monitoring SDK and call FirebaseApp.configure() in your project, which you may already do if you’re using other Firebase tools.
However, the disadvantages of this approach are:
In addition to these, FPM has its own distinct disadvantages:
Hopefully FPM will make improvements in the near future to address these drawbacks.
As an aside, MetricKit could be another choice to consider here (though you would need to build and maintain a backend too). I excluded it because it unfortunately doesn’t provide frame metrics. Another reason is that we would prefer to monitor our Android application’s performance in the same manner.
As described above, our goal is to develop a foundation that does not require much development or maintenance cost. Therefore, building the tracking ourselves is not a good option at this point. Using FPM as it is doesn’t satisfy our needs either, because we cannot see the data for our major screens all at once. In the end, we decided to use FPM as a data source and to build the dashboard ourselves.
Luckily, FPM enables us to export performance data into Google BigQuery. If you’re already using BigQuery, once you enable the link between them, it will automatically sync data once daily.
Export Performance Monitoring data to BigQuery | Firebase
The sample exported data is shown below. For example, to find “app start trace” data, look for records whose event_type is DURATION_TRACE and whose event_name is _app_start. Note: many columns are omitted in the table below.
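As a rough illustration of that filter, here is a minimal Python sketch over a few hypothetical rows (these are not the exported sample). The nested trace_info.duration_us field and the custom trace name are assumptions made for the example; check the export schema for the actual column names.

```python
from statistics import median

# Hypothetical rows shaped loosely like the exported table.
# "duration_us" and the "checkout" trace name are assumptions for illustration.
rows = [
    {"event_type": "DURATION_TRACE", "event_name": "_app_start",
     "trace_info": {"duration_us": 1_200_000}},
    {"event_type": "SCREEN_TRACE", "event_name": "_st_SmoothieViewController",
     "trace_info": {"duration_us": 8_000_000}},
    {"event_type": "DURATION_TRACE", "event_name": "_app_start",
     "trace_info": {"duration_us": 900_000}},
    {"event_type": "DURATION_TRACE", "event_name": "checkout",
     "trace_info": {"duration_us": 300_000}},
    {"event_type": "DURATION_TRACE", "event_name": "_app_start",
     "trace_info": {"duration_us": 1_500_000}},
]


def app_start_durations_ms(rows):
    """Keep only app start traces and convert their durations to milliseconds."""
    return [
        row["trace_info"]["duration_us"] / 1000
        for row in rows
        if row["event_type"] == "DURATION_TRACE"
        and row["event_name"] == "_app_start"
    ]


durations = app_start_durations_ms(rows)
median_launch_ms = median(durations)
```

In practice this filter would live in the BigQuery query itself; the sketch only shows which records count as app start traces.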
Our architecture is simple, as illustrated below. Aside from FPM, we send some data including binary sizes from CI (we use CircleCI) to BigQuery — binary sizes don’t necessarily have to be on BigQuery, but we’re currently using it for simplicity. The collected data is visualized on the dashboard (built on Looker in our case), and used to send alerts when performance degradation is detected.
The main metrics we monitor are as follows:
One of the most important metrics is launch time. “App start trace” is the corresponding metric we can retrieve from FPM. You can find its definition in the doc as follows:
Starts when the application loads the first Object to memory.
Stops after the first successful run loop that occurs after the application receives the UIApplicationDidBecomeActiveNotification notification.
Note that this is not the same as other metrics you may have measured before, such as DYLD_PRINT_STATISTICS.
In order to track the application’s responsiveness or smoothness on each screen, we use “frozen frame ratio.” The official documentation has the following definition:
The ratio of frozen frames for this screen trace, between 0 and 1. E.g. a value of 0.05 means 5% of the frames for this screen instance took more than 700ms to load.
You can find a more detailed explanation about frozen frame ratio on Android Developers’ documentation:
This is a problem because your app appears to be stuck and is unresponsive to user input for almost a full second while the frame is rendering. … No frames in your app should ever take longer than 700ms to render.
FPM provides another metric called “slow frame ratio,” meaning “the percentage of frames for each screen instance that took more than 16ms to render.” In short, the frozen frame ratio is the extreme version of the slow frame ratio.
This may depend on the project, but in our case the frozen frame ratio alone was useful enough: several screens that are important to us had parts that were very slow to render. Though the trend of poorly performing screens can be caught with the slow frame ratio too, we chose the frozen frame ratio so that we pay more attention to the larger issues.
On a different note, if you try to visualize the frozen frame ratio, you’ll probably find that the outcome differs from the graph on the FPM web dashboard. The reason is simply that the dashboard shows adjusted data. I haven’t been able to find an official explanation of this (except for the Google Play support page), but I received the following answer when I contacted the technical support team:
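To make the relationship between the two ratios concrete, here is a minimal Python sketch that computes both for a single screen instance. The frame times are made up, and the function is only an illustration of the definitions quoted above, not how FPM measures frames internally.

```python
def frame_ratios(frame_times_ms, slow_ms=16.0, frozen_ms=700.0):
    """Return (slow_ratio, frozen_ratio) for one screen instance.

    A frame counts as slow if it took more than slow_ms to render and as
    frozen if it took more than frozen_ms, so every frozen frame is also slow.
    """
    total = len(frame_times_ms)
    if total == 0:
        return 0.0, 0.0
    slow = sum(1 for t in frame_times_ms if t > slow_ms)
    frozen = sum(1 for t in frame_times_ms if t > frozen_ms)
    return slow / total, frozen / total


# Hypothetical render times in ms: 7 fast frames, 2 slow frames, 1 frozen frame.
sample = [8, 9, 10, 12, 14, 15, 16, 30, 45, 900]
slow_ratio, frozen_ratio = frame_ratios(sample)
```

With this sample, three of the ten frames exceed 16ms and one of them also exceeds 700ms, which is why the frozen frame ratio can never be larger than the slow frame ratio.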
Firebase Performance Monitoring’s Frozen Frames refers to the percentage of screen instances during which more than 0.1% of frames took longer than 700 ms to render.
Therefore, to see the data in the same way as the FPM dashboard, you need to calculate the ratio of records whose trace_info.screen_info.frozen_frame_ratio is greater than 0.001 for a specific period. For example, if you want to see the frozen frame ratio of SmoothieViewController on each date, you can run a query like this:
SELECT
  CAST(EXTRACT(DATE FROM event_timestamp) AS STRING) AS date,
  -- Share of screen instances in which more than 0.1% of frames were frozen
  COUNT(CASE WHEN trace_info.screen_info.frozen_frame_ratio > 0.001 THEN 1 ELSE NULL END)
    / COUNT(trace_info.screen_info.frozen_frame_ratio) AS value
FROM
  `firebase_performance.TABLE_NAME`
WHERE
  event_type = 'SCREEN_TRACE'
  AND event_name = '_st_SmoothieViewController'
GROUP BY date
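To check the logic of the query on a small scale, the same calculation can be sketched in plain Python. The dates and ratios below are hypothetical, and the function only mirrors the aggregation described above (including the way SQL COUNT on a column skips NULL values); it is not part of our pipeline.

```python
from collections import defaultdict

THRESHOLD = 0.001  # 0.1% of frames, as in the support team's answer


def frozen_share_by_date(records):
    """Share of screen instances per date whose frozen_frame_ratio > THRESHOLD.

    records: iterable of (date_string, frozen_frame_ratio or None).
    """
    over = defaultdict(int)
    total = defaultdict(int)
    for date, ratio in records:
        if ratio is None:  # COUNT(column) in SQL ignores NULLs as well
            continue
        total[date] += 1
        if ratio > THRESHOLD:
            over[date] += 1
    return {date: over[date] / total[date] for date in total}


# Hypothetical screen trace records for one screen.
records = [
    ("2020-05-01", 0.0), ("2020-05-01", 0.02),
    ("2020-05-01", 0.0005), ("2020-05-01", 0.004),
    ("2020-05-02", 0.0), ("2020-05-02", None), ("2020-05-02", 0.01),
    ("2020-05-02", 0.0), ("2020-05-02", 0.0),
]
result = frozen_share_by_date(records)
```

On the first date, two of four instances exceed the threshold; on the second, one of four non-null instances does. Note that a ratio of 0.0005 is not counted even though it is non-zero, which is one reason a plain average of frozen_frame_ratio does not match the dashboard.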
Let me show the frozen frame ratio on one of our screens before and after we made some improvements. The screen selected for this example performed several time-consuming operations on launch, such as creating unnecessary, complex views, calling layoutIfNeeded more often than needed, etc.
After several changes made from early to mid May, we observed a drop in the time taken between init and the end of viewDidLoad: it became almost one-fourth of the original. As a result, as you can see in the graph, the frozen frame ratio also gradually dropped and has recently stabilized at around 0.90%. The change wasn’t rapid because we released improvements little by little across multiple versions, and also because we’re using phased releases.
Since binary sizes cannot be obtained via FPM, we get values directly from App Store Connect after uploading the binary and store them in BigQuery. We use the tunes_build_details function from Fastlane’s spaceship for this. The resulting Tunes::BuildDetails contains binary sizes and other parameters.
fastlane/application.rb at 2.150.3 · fastlane/fastlane · GitHub
Binary sizes are sent to BigQuery every time our submission lane completes.
You can choose any dashboard visualization tool that best suits your needs. We decided to use Looker for the following reasons:
Going back to our original goal, the most important thing we wanted to achieve was to build an environment where engineers (or other teammates) are able to quickly notice the degradation in performance. Fortunately, Looker provides alert support by default, and we are able to easily set up an alert with thresholds for any metric.
Instead of showing our actual dashboard, here’s an example with similar metrics & graphs:
One of the good things about using FPM as a data source is that you can create a dashboard that focuses on the important aspects of your product. Each app has multiple screens that can greatly affect key metrics and are a high priority for monitoring. The fact that we can collect all this data in one place is a big advantage.
We have a similar dashboard for Android too; both are posted on our Slack channel weekly to ensure regular monitoring by the team.
In this article, we introduced our approach to monitoring our iOS (and Android) applications’ performance, which uses Firebase Performance Monitoring as a data source. The main benefit of this approach is the low development cost: the largest part was built in less than a week. Of course, the most suitable approach for you depends entirely on the state of your team. In the future, we may reconsider our architecture when our team grows larger. Ideally we would have an engineering team completely dedicated to mobile infrastructure. It would be best if we could notice the degradation before releasing our app, but using FPM is a tenable solution until then.
We’re still in the trial and error stage, and we’d love to hear your feedback. Thanks for reading!