[Android] Matrix Source Code Reading Notes

Trace-Canary Module

The Trace module is responsible for detecting various kinds of jank: dropped frames, startup time, slow (time-consuming) methods, and ANRs.

Core Utilities

LooperMonitor

In the Trace module, LooperMonitor registers callbacks before and after the Looper's dispatchMessage, which is the foundation for a whole series of monitoring features. It is built on the Looper's Printer mechanism: since the Looper prints a log line before and after each dispatchMessage, logic can be hooked in around every dispatch:

class LooperPrinter implements Printer {
    public Printer origin;
    boolean isHasChecked = false;
    boolean isValid = false;

    LooperPrinter(Printer printer) {
        this.origin = printer;
    }

    @Override
    public void println(String x) {
        if (null != origin) {
            origin.println(x);
            if (origin == this) {
                throw new RuntimeException(TAG + " origin == this");
            }
        }

        if (!isHasChecked) {
            isValid = x.charAt(0) == '>' || x.charAt(0) == '<';
            isHasChecked = true;
            if (!isValid) {
                MatrixLog.e(TAG, "[println] Printer is inValid! x:%s", x);
            }
        }

        if (isValid) {
            dispatch(x.charAt(0) == '>', x);
        }

    }
}

private void dispatch(boolean isBegin, String log) {
    for (LooperDispatchListener listener : listeners) {
        if (listener.isValid()) {
            if (isBegin) {
                if (!listener.isHasDispatchStart) {
                    listener.onDispatchStart(log);
                }
            } else {
                if (listener.isHasDispatchStart) {
                    listener.onDispatchEnd(log);
                }
            }
        } else if (!isBegin && listener.isHasDispatchStart) {
            listener.dispatchEnd();
        }
    }
}

Since a Looper can hold only one Printer, there is a risk that another library overwrites it. LooperMonitor therefore also uses the MessageQueue IdleHandler mechanism: whenever the MessageQueue goes idle, it checks the current Printer and re-installs its own wrapper if it has been replaced, ensuring the hook stays in place.
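The wrap-and-re-hook check can be modeled with plain Java. This is a minimal sketch, not the actual Matrix or Android classes: FakeLooper, PrinterGuard, and resetPrinterIfNeeded are illustrative names standing in for Looper.mLogging and LooperMonitor's IdleHandler check.

```java
// Simplified, non-Android model of LooperMonitor's Printer re-hook check.
interface Printer {
    void println(String x);
}

class FakeLooper {
    Printer printer; // stands in for Looper.mLogging (set via setMessageLogging)
}

class PrinterGuard {
    private final FakeLooper looper;
    Printer myPrinter;

    PrinterGuard(FakeLooper looper) {
        this.looper = looper;
        install();
    }

    private void install() {
        final Printer origin = looper.printer; // wrap whatever is already there
        myPrinter = new Printer() {
            @Override
            public void println(String x) {
                if (origin != null) origin.println(x);
                // dispatch(x.charAt(0) == '>', x) would go here
            }
        };
        looper.printer = myPrinter;
    }

    // In the real implementation this is called from an IdleHandler's
    // queueIdle(): if another library replaced our Printer, wrap it again.
    boolean resetPrinterIfNeeded() {
        if (looper.printer != myPrinter) {
            install();
            return true; // was overwritten, re-hooked
        }
        return false;
    }
}
```

Because install() always wraps the current Printer, an overwriting library's Printer keeps receiving its log lines after the re-hook.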

UIThreadMonitor

UIThreadMonitor builds a layer of UI-level monitoring callbacks on top of LooperMonitor's listener. In dispatchBegin it records the dispatch start time and reports it through its internal LooperObserver.dispatchBegin callback.

In its dispatchEnd, if this dispatch belongs to a frame (i.e. it is rendering-related), doFrameEnd is called to mark the end of the frame and register the callback for the next frame with the next VSync. It then computes the elapsed time between begin and end and calls doFrame on every observer that had a matching dispatchBegin.

private void dispatchBegin() {
    token = dispatchTimeMs[0] = SystemClock.uptimeMillis();
    dispatchTimeMs[2] = SystemClock.currentThreadTimeMillis();
    AppMethodBeat.i(AppMethodBeat.METHOD_ID_DISPATCH);
    synchronized (observers) {
        for (LooperObserver observer : observers) {
            if (!observer.isDispatchBegin()) {
                observer.dispatchBegin(dispatchTimeMs[0], dispatchTimeMs[2], token);
            }
        }
    }
}

private void dispatchEnd() {
    if (isBelongFrame) {
        doFrameEnd(token);
    }
    long start = token;
    long end = SystemClock.uptimeMillis();
    synchronized (observers) {
        for (LooperObserver observer : observers) {
            if (observer.isDispatchBegin()) {
                observer.doFrame(AppMethodBeat.getVisibleScene(), token, SystemClock.uptimeMillis(), isBelongFrame ? end - start : 0, queueCost[CALLBACK_INPUT], queueCost[CALLBACK_ANIMATION], queueCost[CALLBACK_TRAVERSAL]);
            }
        }
    }
    dispatchTimeMs[3] = SystemClock.currentThreadTimeMillis();
    dispatchTimeMs[1] = SystemClock.uptimeMillis();
    AppMethodBeat.o(AppMethodBeat.METHOD_ID_DISPATCH);
    synchronized (observers) {
        for (LooperObserver observer : observers) {
            if (observer.isDispatchBegin()) {
                observer.dispatchEnd(dispatchTimeMs[0], dispatchTimeMs[2], dispatchTimeMs[1], dispatchTimeMs[3], token, isBelongFrame);
            }
        }
    }
}
How does it ensure the Message is a rendering one?

This puzzled me at first: how does UIThreadMonitor determine that the dispatched Message is a rendering one? In other words, how does it decide the value of isBelongFrame? It does so by registering itself as a callback in the Choreographer:

When adding a callback to the Choreographer, calling postFrameCallback directly would go through scheduleFrameLocked and request an extra VSync beyond those needed for rendering. Instead, it obtains mCallbackQueues via reflection and calls addCallbackLocked on it to register the callback.
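The reflection pattern can be sketched against a stand-in class. FakeChoreographer and ReflectionHook below are hypothetical; the real target is the Choreographer's mCallbackQueues field and the CallbackQueue's addCallbackLocked method, which differ in detail.

```java
import java.lang.reflect.Method;

// Stand-in for Choreographer with a hidden callback queue.
class FakeChoreographer {
    private final java.util.List<Runnable> callbackQueue = new java.util.ArrayList<>();

    private void addCallbackLocked(Runnable action) {
        callbackQueue.add(action); // note: does NOT schedule an extra VSync
    }

    int pendingCallbacks() { return callbackQueue.size(); }
}

class ReflectionHook {
    // Mirrors the pattern Matrix uses: look up a non-public member by name,
    // make it accessible, and invoke it directly.
    static void addCallback(FakeChoreographer ch, Runnable action) throws Exception {
        Method add = FakeChoreographer.class
                .getDeclaredMethod("addCallbackLocked", Runnable.class);
        add.setAccessible(true);
        add.invoke(ch, action);
    }
}
```

Bypassing the public postFrameCallback path is exactly what avoids the extra scheduleFrameLocked call mentioned above.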

@Override
public void run() {
    final long start = System.nanoTime();
    try {
        doFrameBegin(token);
        doQueueBegin(CALLBACK_INPUT);
        addFrameCallback(CALLBACK_ANIMATION, new Runnable() {
            @Override
            public void run() {
                doQueueEnd(CALLBACK_INPUT);
                doQueueBegin(CALLBACK_ANIMATION);
            }
        }, true);
        addFrameCallback(CALLBACK_TRAVERSAL, new Runnable() {
            @Override
            public void run() {
                doQueueEnd(CALLBACK_ANIMATION);
                doQueueBegin(CALLBACK_TRAVERSAL);
            }
        }, true);
    } finally {
        if (config.isDevEnv()) {
            MatrixLog.d(TAG, "[UIThreadMonitor#run] inner cost:%sns", System.nanoTime() - start);
        }
    }
}

doFrameBegin sets isBelongFrame to true, and doFrameEnd sets it back to false.

Moreover, the CallbackQueue divides callbacks into three kinds (INPUT, ANIMATION, and TRAVERSAL), and the Choreographer actually executes the three kinds in a fixed priority order:

mFrameInfo.markInputHandlingStart();
doCallbacks(Choreographer.CALLBACK_INPUT, frameTimeNanos);

mFrameInfo.markAnimationsStart();
doCallbacks(Choreographer.CALLBACK_ANIMATION, frameTimeNanos);

mFrameInfo.markPerformTraversalsStart();
doCallbacks(Choreographer.CALLBACK_TRAVERSAL, frameTimeNanos);

So the run method above registers a callback for each of the three types and records each queue's start and end time, making the per-type callback cost available.

private void doQueueBegin(int type) {
    queueStatus[type] = DO_QUEUE_BEGIN;
    queueCost[type] = System.nanoTime();
}
private void doQueueEnd(int type) {
    queueStatus[type] = DO_QUEUE_END;
    queueCost[type] = System.nanoTime() - queueCost[type];
    synchronized (this) {
        callbackExist[type] = false;
    }
}

AppMethodBeat

Method Call Instrumentation

In Matrix, ASM instruments method entry and exit at build time, making it possible to measure method execution times. Before instrumenting, it parses the local mapping file to derive a method id for each method and generates a corresponding collectedMethodMap stored in MethodCollector.

@Override
public MethodVisitor visitMethod(int access, String name, String desc,
                                 String signature, String[] exceptions) {
    if (isABSClass) {
        return super.visitMethod(access, name, desc, signature, exceptions);
    } else {
        if (!hasWindowFocusMethod) {
            hasWindowFocusMethod = MethodCollector.isWindowFocusChangeMethod(name, desc);
        }
        MethodVisitor methodVisitor = cv.visitMethod(access, name, desc, signature, exceptions);
        return new TraceMethodAdapter(api, methodVisitor, access, name, desc, this.className,
                hasWindowFocusMethod, isActivityOrSubClass, isNeedTrace);
    }
}

Only non-abstract classes are modified here. Through TraceMethodAdapter, calls to AppMethodBeat.i and AppMethodBeat.o are inserted at method entry and exit:

@Override
protected void onMethodEnter() {
    TraceMethod traceMethod = collectedMethodMap.get(methodName);
    if (traceMethod != null) {
        traceMethodCount.incrementAndGet();
        mv.visitLdcInsn(traceMethod.id);
        mv.visitMethodInsn(INVOKESTATIC, TraceBuildConstants.MATRIX_TRACE_CLASS, "i", "(I)V", false);
    }
}

@Override
protected void onMethodExit(int opcode) {
    TraceMethod traceMethod = collectedMethodMap.get(methodName);
    if (traceMethod != null) {
        //...
        traceMethodCount.incrementAndGet();
        mv.visitLdcInsn(traceMethod.id);
        mv.visitMethodInsn(INVOKESTATIC, TraceBuildConstants.MATRIX_TRACE_CLASS, "o", "(I)V", false);
    }
}

AppMethodBeat.i maintains a long[] sBuffer that stores the method enter/exit records, laid out as follows:

The highest bit encodes the enter/exit state, the next 20 bits store the method id, and the remaining 43 bits store the timestamp. The timestamp is actually stored as a delta: when the AppMethodBeat class is loaded it records a base time, sDiffTime, and all subsequent timestamps are offsets from that base.
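Under that layout, packing and unpacking a record might look like the sketch below. The field positions follow the description above; the names and exact masks are mine, not Matrix's mergeData code.

```java
// Sketch of the sBuffer record layout: 1 bit for enter/exit,
// 20 bits for the method id, 43 bits for the time delta.
class MethodRecord {
    static long pack(boolean isIn, int methodId, long diffTimeMs) {
        long record = isIn ? (1L << 63) : 0L;      // top bit: enter(1)/exit(0)
        record |= (long) methodId << 43;           // 20-bit method id
        record |= diffTimeMs & 0x7FFFFFFFFFFL;     // low 43 bits: time delta
        return record;
    }

    static boolean isIn(long record)  { return record < 0; } // sign bit set
    static int methodId(long record)  { return (int) ((record >> 43) & 0xFFFFF); }
    static long diffTime(long record) { return record & 0x7FFFFFFFFFFL; }
}
```

43 bits of milliseconds is roughly 278 years of delta, so the base-time trick never overflows in practice.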

Calling SystemClock on every method enter/exit would itself be fairly costly and could introduce performance problems. Instead, a Handler-based task updates a cached "current time" on a fixed 5 ms cycle, and every method that needs a timestamp reads this variable. The authors consider calls shorter than 5 ms not to be slow methods, so that level of precision suffices, and reading a variable is much cheaper than querying the clock.

sHandler.removeCallbacksAndMessages(null);
sHandler.postDelayed(sUpdateDiffTimeRunnable, Constants.TIME_UPDATE_CYCLE_MS);

private static Runnable sUpdateDiffTimeRunnable = new Runnable() {
    @Override
    public void run() {
        try {
            while (true) {
                while (!isPauseUpdateTime && status > STATUS_STOPPED) {
                    sCurrentDiffTime = SystemClock.uptimeMillis() - sDiffTime;
                    SystemClock.sleep(Constants.TIME_UPDATE_CYCLE_MS);
                }
                synchronized (updateTimeLock) {
                    updateTimeLock.wait();
                }
            }
        } catch (InterruptedException e) {
            MatrixLog.e(TAG, "" + e.toString());
        }
    }
};

AppMethodBeat.i and AppMethodBeat.o both append a record to sBuffer, and both check the thread id so that only main-thread methods are recorded:

public static void i(int methodId) {
    if (status <= STATUS_STOPPED) {
        return;
    }
    if (methodId >= METHOD_ID_MAX) {
        return;
    }
    if (status == STATUS_DEFAULT) {
        synchronized (statusLock) {
            if (status == STATUS_DEFAULT) {
                realExecute();
                status = STATUS_READY;
            }
        }
    }
    long threadId = Thread.currentThread().getId();
    if (sMethodEnterListener != null) {
        sMethodEnterListener.enter(methodId, threadId);
    }
    if (threadId == sMainThreadId) {
        if (assertIn) {
            android.util.Log.e(TAG, "ERROR!!! AppMethodBeat.i Recursive calls!!!");
            return;
        }
        assertIn = true;
        if (sIndex < Constants.BUFFER_SIZE) {
            mergeData(methodId, sIndex, true);
        } else {
            sIndex = 0;
            mergeData(methodId, sIndex, true);
        }
        ++sIndex;
        assertIn = false;
    }
}

public static void o(int methodId) {
    if (status <= STATUS_STOPPED) {
        return;
    }
    if (methodId >= METHOD_ID_MAX) {
        return;
    }
    if (Thread.currentThread().getId() == sMainThreadId) {
        if (sIndex < Constants.BUFFER_SIZE) {
            mergeData(methodId, sIndex, false);
        } else {
            sIndex = 0;
            mergeData(methodId, sIndex, false);
        }
        ++sIndex;
    }
}
onWindowFocusChanged Lifecycle Instrumentation

Since Matrix needs to measure the cost of Activity startup, it hooks onWindowFocusChanged, using ASM to instrument the end of the onWindowFocusChanged method:

@Override
protected void onMethodExit(int opcode) {
    TraceMethod traceMethod = collectedMethodMap.get(methodName);
    if (traceMethod != null) {
        if (hasWindowFocusMethod && isActivityOrSubClass && isNeedTrace) {
            TraceMethod windowFocusChangeMethod = TraceMethod.create(-1, Opcodes.ACC_PUBLIC, className, TraceBuildConstants.MATRIX_TRACE_ON_WINDOW_FOCUS_METHOD, TraceBuildConstants.MATRIX_TRACE_ON_WINDOW_FOCUS_METHOD_ARGS);
            if (windowFocusChangeMethod.equals(traceMethod)) {
                traceWindowFocusChangeMethod(mv, className);
            }
        }
        // ...
    }
}

It instruments the onWindowFocusChanged method of every Activity in the current application that overrides it; the injected code ultimately calls AppMethodBeat.at:

private void traceWindowFocusChangeMethod(MethodVisitor mv, String classname) {
    mv.visitVarInsn(Opcodes.ALOAD, 0);
    mv.visitVarInsn(Opcodes.ILOAD, 1);
    mv.visitMethodInsn(Opcodes.INVOKESTATIC, TraceBuildConstants.MATRIX_TRACE_CLASS, "at", "(Landroid/app/Activity;Z)V", false);
}

The at method iterates over all listeners and invokes their onActivityFocused callback; sFocusActivitySet guarantees this callback fires only once per Activity:

public static void at(Activity activity, boolean isFocus) {
    String activityName = activity.getClass().getName();
    if (isFocus) {
        if (sFocusActivitySet.add(activityName)) {
            synchronized (listeners) {
                for (IAppMethodBeatListener listener : listeners) {
                    listener.onActivityFocused(activityName);
                }
            }
            MatrixLog.i(TAG, "[at] visibleScene[%s] has %s focus!", getVisibleScene(), "attach");
        }
    } else {
        if (sFocusActivitySet.remove(activityName)) {
            MatrixLog.i(TAG, "[at] visibleScene[%s] has %s focus!", getVisibleScene(), "detach");
        }
    }
}

Frame Drop Monitoring

The Trace module uses FrameTracer for frame drop monitoring. It registers itself with UIThreadMonitor; on every frame, if the app is in the foreground, notifyListener is called to notify its listeners:

@Override
public void doFrame(String focusedActivityName, long start, long end, long frameCostMs, long inputCostNs, long animationCostNs, long traversalCostNs) {
    if (isForeground()) {
        notifyListener(focusedActivityName, end - start, frameCostMs, frameCostMs >= 0);
    }
}

private void notifyListener(final String visibleScene, final long taskCostMs, final long frameCostMs, final boolean isContainsFrame) {
    long start = System.currentTimeMillis();
    try {
        synchronized (listeners) {
            for (final IDoFrameListener listener : listeners) {
                // ...
                final int dropFrame = (int) (taskCostMs / frameIntervalMs);
                listener.doFrameSync(visibleScene, taskCostMs, frameCostMs, dropFrame, isContainsFrame);
                // ...
            }
        }
    } finally {
       // ...
    }
}

As shown above, the dispatch cost divided by the frame interval (16.66 ms) gives the dropped-frame count (if the work finishes within one frame, the result is 0). FrameTracer has a single default listener, FPSCollector, which collects the dropped-frame data:
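As a worked example of that arithmetic (the 17 ms interval below is an illustrative rounded value for a 60 Hz display, not necessarily the exact frameIntervalMs FrameTracer derives):

```java
// Dropped frames = integer division of dispatch cost by the frame interval.
class DropFrameCalc {
    static int droppedFrames(long taskCostMs, long frameIntervalMs) {
        return (int) (taskCostMs / frameIntervalMs);
    }
}
```

A 50 ms dispatch at a 17 ms interval drops 2 frames; anything under one interval drops 0.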

private class FPSCollector extends IDoFrameListener {

    private HashMap<String, FrameCollectItem> map = new HashMap<>();

    @Override
    public void doFrameAsync(String visibleScene, long taskCost, long frameCostMs, int droppedFrames, boolean isContainsFrame) {
        super.doFrameAsync(visibleScene, taskCost, frameCostMs, droppedFrames, isContainsFrame);
        if (Utils.isEmpty(visibleScene)) {
            return;
        }

        FrameCollectItem item = map.get(visibleScene);
        if (null == item) {
            item = new FrameCollectItem(visibleScene);
            map.put(visibleScene, item);
        }

        item.collect(droppedFrames, isContainsFrame);

        if (item.sumFrameCost >= timeSliceMs) { // report
            map.remove(visibleScene);
            item.report();
        }
    }
}

It first fetches the FrameCollectItem for the current Activity; item.collect then buckets the dropped-frame count into severity ranges. For each Activity, the dropped-frame statistics are reported every 10 s (once the accumulated frame time reaches the time slice), via the Issue class in JSON format.

void collect(int droppedFrames, boolean isContainsFrame) {
    long frameIntervalCost = UIThreadMonitor.getMonitor().getFrameIntervalNanos();
    sumFrameCost += (droppedFrames + 1) * frameIntervalCost / Constants.TIME_MILLIS_TO_NANO;
    sumDroppedFrames += droppedFrames;
    sumFrame++;
    if (!isContainsFrame) {
        sumTaskFrame++;
    }
    if (droppedFrames >= frozenThreshold) {
        dropLevel[DropStatus.DROPPED_FROZEN.index]++;
        dropSum[DropStatus.DROPPED_FROZEN.index] += droppedFrames;
    } else if (droppedFrames >= highThreshold) {
        dropLevel[DropStatus.DROPPED_HIGH.index]++;
        dropSum[DropStatus.DROPPED_HIGH.index] += droppedFrames;
    } else if (droppedFrames >= middleThreshold) {
        dropLevel[DropStatus.DROPPED_MIDDLE.index]++;
        dropSum[DropStatus.DROPPED_MIDDLE.index] += droppedFrames;
    } else if (droppedFrames >= normalThreshold) {
        dropLevel[DropStatus.DROPPED_NORMAL.index]++;
        dropSum[DropStatus.DROPPED_NORMAL.index] += droppedFrames;
    } else {
        dropLevel[DropStatus.DROPPED_BEST.index]++;
        dropSum[DropStatus.DROPPED_BEST.index] += (droppedFrames < 0 ? 0 : droppedFrames);
    }
}

As the code shows, Matrix classifies frame-drop severity into the following levels:

Best Normal Middle High Frozen
[0:3) [3:9) [9:24) [24:42) [42:∞)
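The bucketing can be sketched with the table's default thresholds. The thresholds are passed explicitly and the class name is mine; Matrix reads them from its TraceConfig.

```java
// Severity bucketing matching the table: Best [0,3), Normal [3,9),
// Middle [9,24), High [24,42), Frozen [42,∞).
class DropLevelClassifier {
    static String classify(int droppedFrames) {
        if (droppedFrames >= 42) return "FROZEN";
        if (droppedFrames >= 24) return "HIGH";
        if (droppedFrames >= 9)  return "MIDDLE";
        if (droppedFrames >= 3)  return "NORMAL";
        return "BEST";
    }
}
```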

Slow Method Monitoring

Slow-method monitoring in Matrix is built on the sBuffer data collected by AppMethodBeat; EvilMethodTracer is mainly responsible for reporting.

EvilMethodTracer's reporting is driven by a listener registered with UIThreadMonitor. In dispatchBegin it calls AppMethodBeat.maskIndex to obtain an IndexRecord pointing at the current sBuffer index, marking where this dispatch started; these IndexRecords are kept in a linked list.

@Override
public void dispatchBegin(long beginMs, long cpuBeginMs, long token) {
    super.dispatchBegin(beginMs, cpuBeginMs, token);
    indexRecord = AppMethodBeat.getInstance().maskIndex("EvilMethodTracer#dispatchBegin");
}

Then in dispatchEnd, it checks whether the dispatch took longer than the configured threshold (700 ms). If so, the slice is reported: AppMethodBeat.copyData retrieves all enter/exit records between the dispatch-start index and the current index, and an AnalyseTask is posted to the HandlerThread provided by Matrix to analyze that slice. This yields the per-method costs within each dispatched message.

@Override
public void dispatchEnd(long beginMs, long cpuBeginMs, long endMs, long cpuEndMs, long token, boolean isBelongFrame) {
    super.dispatchEnd(beginMs, cpuBeginMs, endMs, cpuEndMs, token, isBelongFrame);
    long start = config.isDevEnv() ? System.currentTimeMillis() : 0;
    try {
        long dispatchCost = endMs - beginMs;
        if (dispatchCost >= evilThresholdMs) {
            long[] data = AppMethodBeat.getInstance().copyData(indexRecord);
            long[] queueCosts = new long[3];
            System.arraycopy(queueTypeCosts, 0, queueCosts, 0, 3);
            String scene = AppMethodBeat.getVisibleScene();
            MatrixHandlerThread.getDefaultHandler().post(new AnalyseTask(isForeground(), scene, data, queueCosts, cpuEndMs - cpuBeginMs, endMs - beginMs, token));
        }
    } finally {
        indexRecord.release();
        if (config.isDevEnv()) {
            String usage = Utils.calculateCpuUsage(cpuEndMs - cpuBeginMs, endMs - beginMs);
            MatrixLog.v(TAG, "[dispatchEnd] token:%s cost:%sms cpu:%sms usage:%s innerCost:%s",
                    token, endMs - beginMs, cpuEndMs - cpuBeginMs, usage, System.currentTimeMillis() - start);
        }
    }
}

In AnalyseTask, TraceDataUtils.structuredDataToStack converts the data slice into a corresponding stack of MethodItems representing the call stack, recording each method's id, execution time, depth, and so on.

It then calls TraceDataUtils.trimStack, pruning the call stack according to the supplied filter to strip out excess detail.

TraceDataUtils.structuredDataToStack(data, stack, true, endMs);
TraceDataUtils.trimStack(stack, Constants.TARGET_EVIL_METHOD_STACK, new TraceDataUtils.IStructuredDataFilter() {
    @Override
    public boolean isFilter(long during, int filterCount) {
        return during < filterCount * Constants.TIME_UPDATE_CYCLE_MS;
    }

    @Override
    public int getFilterMaxCount() {
        return Constants.FILTER_STACK_MAX_COUNT;
    }

    @Override
    public void fallback(List<MethodItem> stack, int size) {
        MatrixLog.w(TAG, "[fallback] size:%s targetSize:%s stack:%s", size, Constants.TARGET_EVIL_METHOD_STACK, stack);
        Iterator iterator = stack.listIterator(Math.min(size, Constants.TARGET_EVIL_METHOD_STACK));
        while (iterator.hasNext()) {
            iterator.next();
            iterator.remove();
        }
    }
});

During trimming, while the stack is still larger than targetCount (30), entries matching isFilter are repeatedly removed: methods whose execution time falls below 5 ms are pruned first. If the stack is still too large after a pass, the threshold is scaled up and pruning continues (5 ms -> 10 ms -> 15 ms -> ...).

public static void trimStack(List<MethodItem> stack, int targetCount, IStructuredDataFilter filter) {
    if (0 > targetCount) {
        stack.clear();
        return;
    }
    int filterCount = 1;
    int curStackSize = stack.size();
    while (curStackSize > targetCount) {
        ListIterator<MethodItem> iterator = stack.listIterator(stack.size());
        while (iterator.hasPrevious()) {
            MethodItem item = iterator.previous();
            if (filter.isFilter(item.durTime, filterCount)) {
                iterator.remove();
                curStackSize--;
                if (curStackSize <= targetCount) {
                    return;
                }
            }
        }
        curStackSize = stack.size();
        filterCount++;
        if (filter.getFilterMaxCount() < filterCount) {
            break;
        }
    }
    int size = stack.size();
    if (size > targetCount) {
        filter.fallback(stack, size);
    }
}
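The trimming loop can be exercised with a runnable walk-through, using plain longs for durations instead of MethodItem. TrimDemo is illustrative; the 5 ms factor corresponds to Constants.TIME_UPDATE_CYCLE_MS in the isFilter implementation shown earlier.

```java
import java.util.List;
import java.util.ListIterator;

// Mimics trimStack: repeatedly prune entries whose duration falls below an
// escalating threshold (5 ms, 10 ms, 15 ms, ...) until the target size fits.
class TrimDemo {
    static void trim(List<Long> durations, int targetCount, int maxFilterCount) {
        int filterCount = 1;
        while (durations.size() > targetCount && filterCount <= maxFilterCount) {
            ListIterator<Long> it = durations.listIterator(durations.size());
            while (it.hasPrevious()) {
                long during = it.previous();
                if (during < filterCount * 5L) { // isFilter: during < filterCount * 5ms
                    it.remove();
                    if (durations.size() <= targetCount) return;
                }
            }
            filterCount++;
        }
    }
}
```

With durations [2, 8, 30, 4, 12, 100] and a target of 3, the first pass (threshold 5 ms) drops 2 and 4, and the second pass (threshold 10 ms) drops 8, leaving the three slowest entries.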

After the call stack has been trimmed, the slow-method detection result is reported via Issue in JSON format.

Startup Monitoring

AppMethodBeat hooks the onWindowFocusChanged method, and StartupTracer uses the moment onWindowFocusChanged is invoked to measure cold- and warm-startup cost. As the Javadoc in Matrix shows, application startup is divided into the following phases:

(figure: startup phase diagram, not included)

The corresponding code:

@Override
public void onActivityFocused(String activity) {
    if (isColdStartup()) {
        if (firstScreenCost == 0) {
            this.firstScreenCost = uptimeMillis() - ActivityThreadHacker.getEggBrokenTime();
        }
        if (hasShowSplashActivity) {
            coldCost = uptimeMillis() - ActivityThreadHacker.getEggBrokenTime();
        } else {
            if (splashActivities.contains(activity)) {
                hasShowSplashActivity = true;
            } else if (splashActivities.isEmpty()) {
                MatrixLog.i(TAG, "default splash activity[%s]", activity);
                coldCost = firstScreenCost;
            } else {
                MatrixLog.w(TAG, "pass this activity[%s] at duration of start up! splashActivities=%s", activity, splashActivities);
            }
        }
        if (coldCost > 0) {
            analyse(ActivityThreadHacker.getApplicationCost(), firstScreenCost, coldCost, false);
        }
    } else if (isWarmStartUp()) {
        isWarmStartUp = false;
        long warmCost = uptimeMillis() - ActivityThreadHacker.getLastLaunchActivityTime();
        if (warmCost > 0) {
            analyse(ActivityThreadHacker.getApplicationCost(), firstScreenCost, warmCost, true);
        }
    }
}

This matches the diagram; EggBrokenTime is the moment of the very first method call in the whole application.

private void analyse(long applicationCost, long firstScreenCost, long allCost, boolean isWarmStartUp) {
    MatrixLog.i(TAG, "[report] applicationCost:%s firstScreenCost:%s allCost:%s isWarmStartUp:%s", applicationCost, firstScreenCost, allCost, isWarmStartUp);
    long[] data = new long[0];
    if (!isWarmStartUp && allCost >= coldStartupThresholdMs) { // for cold startup
        data = AppMethodBeat.getInstance().copyData(ActivityThreadHacker.sApplicationCreateBeginMethodIndex);
        ActivityThreadHacker.sApplicationCreateBeginMethodIndex.release();
    } else if (isWarmStartUp && allCost >= warmStartupThresholdMs) {
        data = AppMethodBeat.getInstance().copyData(ActivityThreadHacker.sLastLaunchActivityMethodIndex);
        ActivityThreadHacker.sLastLaunchActivityMethodIndex.release();
    }
    MatrixHandlerThread.getDefaultHandler().post(new AnalyseTask(data, applicationCost, firstScreenCost, allCost, isWarmStartUp, ActivityThreadHacker.sApplicationCreateScene));
}

Cold and warm startup times are checked against their respective thresholds; if a threshold is exceeded, the methods invoked during that startup window are collected, trimmed, and reported.

ANR Detection

The core idea of ANR detection is to construct an AnrTask message in dispatchBegin and post it with a 5 s delay; if the dispatch finishes within 5 s, the message is removed in dispatchEnd. This simulates the conditions under which an ANR arises.

@Override
public void dispatchBegin(long beginMs, long cpuBeginMs, long token) {
    super.dispatchBegin(beginMs, cpuBeginMs, token);
    anrTask = new AnrHandleTask(AppMethodBeat.getInstance().maskIndex("AnrTracer#dispatchBegin"), token);
    if (traceConfig.isDevEnv()) {
        MatrixLog.v(TAG, "* [dispatchBegin] token:%s index:%s", token, anrTask.beginRecord.index);
    }
    anrHandler.postDelayed(anrTask, Constants.DEFAULT_ANR - (SystemClock.uptimeMillis() - token));
}

@Override
public void dispatchEnd(long beginMs, long cpuBeginMs, long endMs, long cpuEndMs, long token, boolean isBelongFrame) {
    super.dispatchEnd(beginMs, cpuBeginMs, endMs, cpuEndMs, token, isBelongFrame);
    if (traceConfig.isDevEnv()) {
        MatrixLog.v(TAG, "[dispatchEnd] token:%s cost:%sms cpu:%sms usage:%s",
                token, endMs - beginMs, cpuEndMs - cpuBeginMs, Utils.calculateCpuUsage(cpuEndMs - cpuBeginMs, endMs - beginMs));
    }
    if (null != anrTask) {
        anrTask.getBeginRecord().release();
        anrHandler.removeCallbacks(anrTask);
    }
}

When the AnrTask fires, it collects process, memory, thread-state, frame, and call-stack information, then reports it.

IO-Canary Module

At start it runs initDetectorsAndHookers to install the hooks and launch the detectors that find I/O-related problems.

private void initDetectorsAndHookers(IOConfig ioConfig) {
    assert ioConfig != null;
    if (ioConfig.isDetectFileIOInMainThread()
        || ioConfig.isDetectFileIOBufferTooSmall()
        || ioConfig.isDetectFileIORepeatReadSameFile()) {
        IOCanaryJniBridge.install(ioConfig, this);
    }
    //if only detect io closeable leak use CloseGuardHooker is Better
    if (ioConfig.isDetectIOClosableLeak()) {
        mCloseGuardHooker = new CloseGuardHooker(this);
        mCloseGuardHooker.hook();
    }
}

IOCanaryJniBridge.install calls into the native layer via enableDetector to start the individual detectors, and via doHook to install the corresponding native hooks.

public static void install(IOConfig config, OnJniIssuePublishListener listener) {
    MatrixLog.v(TAG, "install sIsTryInstall:%b", sIsTryInstall);
    if (sIsTryInstall) {
        return;
    }

    //load lib
    if (!loadJni()) {
        MatrixLog.e(TAG, "install loadJni failed");
        return;
    }

    //set listener
    sOnIssuePublishListener = listener;

    try {
        //set config
        if (config != null) {
            if (config.isDetectFileIOInMainThread()) {
                enableDetector(DetectorType.MAIN_THREAD_IO);
                // ms to μs
                setConfig(ConfigKey.MAIN_THREAD_THRESHOLD, config.getFileMainThreadTriggerThreshold() * 1000L);
            }

            if (config.isDetectFileIOBufferTooSmall()) {
                enableDetector(DetectorType.SMALL_BUFFER);
                setConfig(ConfigKey.SMALL_BUFFER_THRESHOLD, config.getFileBufferSmallThreshold());
            }

            if (config.isDetectFileIORepeatReadSameFile()) {
                enableDetector(DetectorType.REPEAT_READ);
                setConfig(ConfigKey.REPEAT_READ_THRESHOLD, config.getFileRepeatReadThreshold());
            }
        }

        //hook
        doHook();

        sIsTryInstall = true;
    } catch (Error e) {
        MatrixLog.printErrStackTrace(TAG, e, "call jni method error");
    }
}

IO Native Hook

In doHook, xhook replaces the open, open64, read, __read_chk, write, __write_chk, and close functions, injecting the data-collection logic.

JNIEXPORT jboolean JNICALL
Java_com_tencent_matrix_iocanary_core_IOCanaryJniBridge_doHook(JNIEnv *env, jclass type) {
    __android_log_print(ANDROID_LOG_INFO, kTag, "doHook");
    for (int i = 0; i < TARGET_MODULE_COUNT; ++i) {
        const char* so_name = TARGET_MODULES[i];
        __android_log_print(ANDROID_LOG_INFO, kTag, "try to hook function in %s.", so_name);
        void* soinfo = xhook_elf_open(so_name);
        if (!soinfo) {
            __android_log_print(ANDROID_LOG_WARN, kTag, "Failure to open %s, try next.", so_name);
            continue;
        }
        xhook_hook_symbol(soinfo, "open", (void*)ProxyOpen, (void**)&original_open);
        xhook_hook_symbol(soinfo, "open64", (void*)ProxyOpen64, (void**)&original_open64);
        bool is_libjavacore = (strstr(so_name, "libjavacore.so") != nullptr);
        if (is_libjavacore) {
            if (xhook_hook_symbol(soinfo, "read", (void*)ProxyRead, (void**)&original_read) != 0) {
                __android_log_print(ANDROID_LOG_WARN, kTag, "doHook hook read failed, try __read_chk");
                if (xhook_hook_symbol(soinfo, "__read_chk", (void*)ProxyReadChk, (void**)&original_read_chk) != 0) {
                    __android_log_print(ANDROID_LOG_WARN, kTag, "doHook hook failed: __read_chk");
                    xhook_elf_close(soinfo);
                    return JNI_FALSE;
                }
            }
            if (xhook_hook_symbol(soinfo, "write", (void*)ProxyWrite, (void**)&original_write) != 0) {
                __android_log_print(ANDROID_LOG_WARN, kTag, "doHook hook write failed, try __write_chk");
                if (xhook_hook_symbol(soinfo, "__write_chk", (void*)ProxyWriteChk, (void**)&original_write_chk) != 0) {
                    __android_log_print(ANDROID_LOG_WARN, kTag, "doHook hook failed: __write_chk");
                    xhook_elf_close(soinfo);
                    return JNI_FALSE;
                }
            }
        }
        xhook_hook_symbol(soinfo, "close", (void*)ProxyClose, (void**)&original_close);
        xhook_elf_close(soinfo);
    }
    __android_log_print(ANDROID_LOG_INFO, kTag, "doHook done.");
    return JNI_TRUE;
}

open

The open hook targets only the main thread; the real proxy logic is in the DoProxyOpenLogic function.

int ProxyOpen(const char *pathname, int flags, mode_t mode) {
    if(!IsMainThread()) {
        return original_open(pathname, flags, mode);
    }
    int ret = original_open(pathname, flags, mode);
    if (ret != -1) {
        DoProxyOpenLogic(pathname, flags, mode, ret);
    }
    return ret;
}

DoProxyOpenLogic mainly fetches the Java-side stack trace and thread name, then invokes the OnOpen callback, which eventually reaches IOInfoCollector::OnOpen.

static void DoProxyOpenLogic(const char *pathname, int flags, mode_t mode, int ret) {
    JNIEnv* env = NULL;
    kJvm->GetEnv((void**)&env, JNI_VERSION_1_6);
    if (env == NULL || !kInitSuc) {
        __android_log_print(ANDROID_LOG_ERROR, kTag, "ProxyOpen env null or kInitSuc:%d", kInitSuc);
    } else {
        jobject java_context_obj = env->CallStaticObjectMethod(kJavaBridgeClass, kMethodIDGetJavaContext);
        if (NULL == java_context_obj) {
            return;
        }
        jstring j_stack = (jstring) env->GetObjectField(java_context_obj, kFieldIDStack);
        jstring j_thread_name = (jstring) env->GetObjectField(java_context_obj, kFieldIDThreadName);
        char* thread_name = jstringToChars(env, j_thread_name);
        char* stack = jstringToChars(env, j_stack);
        JavaContext java_context(GetCurrentThreadId(), thread_name == NULL ? "" : thread_name, stack == NULL ? "" : stack);
        free(stack);
        free(thread_name);
        iocanary::IOCanary::Get().OnOpen(pathname, flags, mode, ret, java_context);
        env->DeleteLocalRef(java_context_obj);
        env->DeleteLocalRef(j_stack);
        env->DeleteLocalRef(j_thread_name);
    }
}

IOInfoCollector::OnOpen records the information gathered above and stores it in a map keyed by the returned fd:

void IOInfoCollector::OnOpen(const char *pathname, int flags, mode_t mode
        , int open_ret, const JavaContext& java_context) {
    if (open_ret == -1) {
        return;
    }
    if (info_map_.find(open_ret) != info_map_.end()) {
        return;
    }
    std::shared_ptr<IOInfo> info = std::make_shared<IOInfo>(pathname, java_context);
    info_map_.insert(std::make_pair(open_ret, info));
}

read & write

read and write follow essentially the same logic, so only read is analyzed here.

The read hook likewise targets only the main thread. It measures the cost of the read, then invokes the OnRead callback, which reaches IOInfoCollector::OnRead:

ssize_t ProxyRead(int fd, void *buf, size_t size) {
    if(!IsMainThread()) {
        return original_read(fd, buf, size);
    }
    int64_t start = GetTickCountMicros();
    size_t ret = original_read(fd, buf, size);
    long read_cost_us = GetTickCountMicros() - start;
    iocanary::IOCanary::Get().OnRead(fd, buf, size, ret, read_cost_us);
    return ret;
}

IOInfoCollector::OnRead checks whether this fd is one of the monitored open fds; if so, CountRWInfo records the read/write statistics.

void IOInfoCollector::OnRead(int fd, const void *buf, size_t size,
                             ssize_t read_ret, long read_cost) {
    if (read_ret == -1 || read_cost < 0) {
        return;
    }
    if (info_map_.find(fd) == info_map_.end()) {
        return;
    }
    CountRWInfo(fd, FileOpType::kRead, size, read_cost);
}

It records the operation count, total operation size, and cost, and tracks statistics such as the maximum single read/write cost, the maximum continuous read/write cost, and the buffer size, for the detectors to analyze later.

void IOInfoCollector::CountRWInfo(int fd, const FileOpType &fileOpType, long op_size, long rw_cost) {
    if (info_map_.find(fd) == info_map_.end()) {
        return;
    }
    const int64_t now = GetSysTimeMicros();
    info_map_[fd]->op_cnt_ ++;
    info_map_[fd]->op_size_ += op_size;
    info_map_[fd]->rw_cost_us_ += rw_cost;
    if (rw_cost > info_map_[fd]->max_once_rw_cost_time_μs_) {
        info_map_[fd]->max_once_rw_cost_time_μs_ = rw_cost;
    }
    if (info_map_[fd]->last_rw_time_μs_ > 0 && (now - info_map_[fd]->last_rw_time_μs_) < kContinualThreshold) {
        info_map_[fd]->current_continual_rw_time_μs_ += rw_cost;
    } else {
        info_map_[fd]->current_continual_rw_time_μs_ = rw_cost;
    }
    if (info_map_[fd]->current_continual_rw_time_μs_ > info_map_[fd]->max_continual_rw_cost_time_μs_) {
        info_map_[fd]->max_continual_rw_cost_time_μs_ = info_map_[fd]->current_continual_rw_time_μs_;
    }
    info_map_[fd]->last_rw_time_μs_ = now;
    if (info_map_[fd]->buffer_size_ < op_size) {
        info_map_[fd]->buffer_size_ = op_size;
    }
    if (info_map_[fd]->op_type_ == FileOpType::kInit) {
        info_map_[fd]->op_type_ = fileOpType;
    }
}

close

The close hook again targets the main thread. It computes the total cost, the file size, and other information, removes the entry from the map, and hands it to the upper layer, which releases the lock waiting for this I/O record and returns it to the caller of TakeFileIOInfo.

std::shared_ptr<IOInfo> IOInfoCollector::OnClose(int fd, int close_ret) {
    if (info_map_.find(fd) == info_map_.end()) {
        return nullptr;
    }
    info_map_[fd]->total_cost_μs_ = GetSysTimeMicros() - info_map_[fd]->start_time_μs_;
    info_map_[fd]->file_size_ = GetFileSize(info_map_[fd]->path_.c_str());
    std::shared_ptr<IOInfo> info = info_map_[fd];
    info_map_.erase(fd);
    return info;
}

Detector

Matrix 中通过 Detector 主动分析 I/O 数据,检测是否存在主线程 I/O、重复读取、Buffer 过小等问题,这些检测都依赖前面 hook 所收集到的 I/O 数据。

主线程 I/O

主线程 I/O 检测会在 I/O 最大单次读写时间或 I/O 最大连续读写时间超过了对应设置的阈值时,进行上报。

void FileIOMainThreadDetector::Detect(const IOCanaryEnv &env, const IOInfo &file_io_info,
                                      std::vector<Issue>& issues) {
    if (GetMainThreadId() == file_io_info.java_context_.thread_id_) {
        int type = 0;
        if (file_io_info.max_once_rw_cost_time_μs_ > IOCanaryEnv::kPossibleNegativeThreshold) {
            type = 1;
        }
        if(file_io_info.max_continual_rw_cost_time_μs_ > env.GetMainThreadThreshold()) {
            type |= 2;
        }
        if (type != 0) {
            Issue issue(kType, file_io_info);
            issue.repeat_read_cnt_ = type;  //use repeat to record type
            PublishIssue(issue, issues);
        }
    }
}

重复读取

重复读取检测会遍历存储的列表,寻找是否存在重复读取的情况,若存在则会记录并在重复读取超过阈值时进行上报。

void FileIORepeatReadDetector::Detect(const IOCanaryEnv &env,
                                      const IOInfo &file_io_info,
                                      std::vector<Issue>& issues) {
    const std::string& path = file_io_info.path_;
    if (observing_map_.find(path) == observing_map_.end()) {
        if (file_io_info.max_continual_rw_cost_time_μs_ < env.kPossibleNegativeThreshold) {
            return;
        }
        observing_map_.insert(std::make_pair(path, std::vector<RepeatReadInfo>()));
    }
    std::vector<RepeatReadInfo>& repeat_infos = observing_map_[path];
    if (file_io_info.op_type_ == FileOpType::kWrite) {
        repeat_infos.clear();
        return;
    }
    RepeatReadInfo repeat_read_info(file_io_info.path_, file_io_info.java_context_.stack_, file_io_info.java_context_.thread_id_,
                                  file_io_info.op_size_, file_io_info.file_size_);
    if (repeat_infos.size() == 0) {
        repeat_infos.push_back(repeat_read_info);
        return;
    }
    if((GetTickCount() - repeat_infos[repeat_infos.size() - 1].op_timems) > 17) {   //17ms todo astrozhou add to params
        repeat_infos.clear();
    }
    bool found = false;
    int repeatCnt;
    for (auto& info : repeat_infos) {
        if (info == repeat_read_info) {
            found = true;
            info.IncRepeatReadCount();
            repeatCnt = info.GetRepeatReadCount();
            break;
        }
    }
    if (!found) {
        repeat_infos.push_back(repeat_read_info);
        return;
    }
    if (repeatCnt >= env.GetRepeatReadThreshold()) {
        Issue issue(kType, file_io_info);
        issue.repeat_read_cnt_ = repeatCnt;
        issue.stack = repeat_read_info.GetStack();
        PublishIssue(issue, issues);
    }
}

Buffer 过小

Buffer 过小检测会在读写操作次数超过阈值、平均每次操作的数据量小于设定阈值,且最大连续读写耗时达到一定下限(排除耗时可忽略的情况)时上报,说明使用了过小的 Buffer 导致多次小块读写。

void FileIOSmallBufferDetector::Detect(const IOCanaryEnv &env, const IOInfo &file_io_info,
                                       std::vector<Issue>& issues) {
    if (file_io_info.op_cnt_ > env.kSmallBufferOpTimesThreshold && (file_io_info.op_size_ / file_io_info.op_cnt_) < env.GetSmallBufferThreshold()
            && file_io_info.max_continual_rw_cost_time_μs_ >= env.kPossibleNegativeThreshold) {
        PublishIssue(Issue(kType, file_io_info), issues);
    }
}
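上面的判定条件可以归纳为:操作次数多,且平均每次操作的数据量小。下面用一个独立的 Java 判断函数示意(阈值均为假设值,并非 Matrix 的默认配置):

```java
public class SmallBufferCheck {
    static final long OP_CNT_THRESHOLD = 20;      // 假设:操作次数阈值
    static final long AVG_SIZE_THRESHOLD = 4096;  // 假设:平均操作大小阈值(字节)

    // 操作次数超阈值且平均每次操作的数据量过小时,判定为 Buffer 过小
    static boolean isSmallBuffer(long opCnt, long totalOpSize) {
        return opCnt > OP_CNT_THRESHOLD
                && (totalOpSize / opCnt) < AVG_SIZE_THRESHOLD;
    }

    public static void main(String[] args) {
        // 100 次操作共 100KB,平均每次约 1KB,小于阈值,判定为 Buffer 过小
        System.out.println(isSmallBuffer(100, 100 * 1024)); // true
        // 10 次操作共 10MB,次数未超阈值,不判定
        System.out.println(isSmallBuffer(10, 10 * 1024 * 1024)); // false
    }
}
```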

资源泄漏检测

资源泄漏检测(资源未 close)是通过 hook 系统的 CloseGuard 实现的。它通过反射调用 CloseGuard 的 setReporter 方法,将 Reporter 替换为动态代理生成的实现,从而在系统进行 report 时拦截并上报。

private boolean tryHook() {
    try {
        Class<?> closeGuardCls = Class.forName("dalvik.system.CloseGuard");
        Class<?> closeGuardReporterCls = Class.forName("dalvik.system.CloseGuard$Reporter");
        Method methodGetReporter = closeGuardCls.getDeclaredMethod("getReporter");
        Method methodSetReporter = closeGuardCls.getDeclaredMethod("setReporter", closeGuardReporterCls);
        Method methodSetEnabled = closeGuardCls.getDeclaredMethod("setEnabled", boolean.class);
        sOriginalReporter = methodGetReporter.invoke(null);
        methodSetEnabled.invoke(null, true);
        // open matrix close guard also
        MatrixCloseGuard.setEnabled(true);
        ClassLoader classLoader = closeGuardReporterCls.getClassLoader();
        if (classLoader == null) {
            return false;
        }
        methodSetReporter.invoke(null, Proxy.newProxyInstance(classLoader,
            new Class<?>[]{closeGuardReporterCls},
            new IOCloseLeakDetector(issueListener, sOriginalReporter)));
        return true;
    } catch (Throwable e) {
        MatrixLog.e(TAG, "tryHook exp=%s", e);
    }
    return false;
}

Resource-Canary 模块

Resource-Canary 主要负责对如内存泄漏、重复 Bitmap 等资源使用不当的情况进行检测与分析。它分为两个阶段:线上检测阶段与线下分析阶段,分别对应 matrix-resource-canary-android 与 matrix-resource-canary-analyzer 两个包。

线上检测

matrix-resource-canary-android 主要负责了线上的检测功能。

Activity 内存泄漏检测

对于 Activity 内存泄漏检测,其核心思想与 LeakCanary 类似,不过在对 GC 的处理方面有所不同。在 ResourcePlugin.start 时,它会调用创建的 ActivityRefWatcher.start,其中会向 Application 注册用于监听 onDestroy 的生命周期回调,之后调用 scheduleDetectProcedure 方法启动检测。

@Override
public void start() {
    stopDetect();
    final Application app = mResourcePlugin.getApplication();
    if (app != null) {
        app.registerActivityLifecycleCallbacks(mRemovedActivityMonitor);
        AppActiveMatrixDelegate.INSTANCE.addListener(this);
        scheduleDetectProcedure();
        MatrixLog.i(TAG, "watcher is started.");
    }
}

其中该生命周期回调会在 onDestroy 的时候调用 pushDestroyedActivityInfo 方法

private final Application.ActivityLifecycleCallbacks mRemovedActivityMonitor = new ActivityLifeCycleCallbacksAdapter() {
    @Override
    public void onActivityDestroyed(Activity activity) {
        pushDestroyedActivityInfo(activity);
    }
};

它主要做的事就是构造 Activity 对应的 DestroyedActivityInfo 并放入 mDestroyedActivityInfos 容器中。

private void pushDestroyedActivityInfo(Activity activity) {
    final String activityName = activity.getClass().getName();
    if (!mResourcePlugin.getConfig().getDetectDebugger() && isPublished(activityName)) {
        MatrixLog.i(TAG, "activity leak with name %s had published, just ignore", activityName);
        return;
    }
    final UUID uuid = UUID.randomUUID();
    final StringBuilder keyBuilder = new StringBuilder();
    keyBuilder.append(ACTIVITY_REFKEY_PREFIX).append(activityName)
        .append('_').append(Long.toHexString(uuid.getMostSignificantBits())).append(Long.toHexString(uuid.getLeastSignificantBits()));
    final String key = keyBuilder.toString();
    final DestroyedActivityInfo destroyedActivityInfo
        = new DestroyedActivityInfo(key, activity, activityName);
    mDestroyedActivityInfos.add(destroyedActivityInfo);
}
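生成的 key 形如「前缀 + Activity 类名 + '_' + UUID 高低位的十六进制」,用固定的 UUID 可以直观看到其结构(下面的前缀与类名均为演示用取值,实际前缀以源码常量 ACTIVITY_REFKEY_PREFIX 为准):

```java
import java.util.UUID;

public class RefKeyDemo {
    // 按 pushDestroyedActivityInfo 的方式拼接 refKey
    static String buildKey(String prefix, String activityName, UUID uuid) {
        return new StringBuilder()
                .append(prefix).append(activityName).append('_')
                .append(Long.toHexString(uuid.getMostSignificantBits()))
                .append(Long.toHexString(uuid.getLeastSignificantBits()))
                .toString();
    }

    public static void main(String[] args) {
        // 固定 UUID 便于观察:高 64 位为 1,低 64 位为 2
        UUID uuid = UUID.fromString("00000000-0000-0001-0000-000000000002");
        System.out.println(buildKey("REFKEY_", "com.demo.MainActivity", uuid));
        // REFKEY_com.demo.MainActivity_12
    }
}
```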

之后在 scheduleDetectProcedure 方法中通过线程池执行了 mScanDestroyedActivitiesTask

private void scheduleDetectProcedure() {
    mDetectExecutor.executeInBackground(mScanDestroyedActivitiesTask);
}

它是一个 RetryableTask,而执行它的 mDetectExecutor 是一个 RetryableTaskExecutor,它会根据 Task 执行的返回值决定是否需要重试。

该任务开始运行时会构建一个指向哨兵 Object 的弱引用 sentinelRef,之后调用 triggerGc 方法建议虚拟机进行 GC。若 sentinelRef.get() 不为 null,说明 GC 还没有发生,则返回 RETRY 等待下次重试。

通过上面的哨兵机制可以保证后面的代码只在确实发生过 GC 之后执行:此时遍历 mDestroyedActivityInfos,若某项弱引用的 get() 仍不为 null,说明对应 Activity 没有被回收,即发生了内存泄漏。

在检测到发生了内存泄漏后,它会根据设置的 DumpHprof 策略来决定是否 dump 以及如何 dump hprof 文件,之后进行上报。

private final RetryableTask mScanDestroyedActivitiesTask = new RetryableTask() {
    @Override
    public Status execute() {
        // If destroyed activity list is empty, just wait to save power.
        if (mDestroyedActivityInfos.isEmpty()) {
            MatrixLog.i(TAG, "DestroyedActivityInfo isEmpty!");
            return Status.RETRY;
        }
        // Fake leaks will be generated when debugger is attached.
        if (Debug.isDebuggerConnected() && !mResourcePlugin.getConfig().getDetectDebugger()) {
            MatrixLog.w(TAG, "debugger is connected, to avoid fake result, detection was delayed.");
            return Status.RETRY;
        }
        final WeakReference<Object> sentinelRef = new WeakReference<>(new Object());
        triggerGc();
        if (sentinelRef.get() != null) {
            // System ignored our gc request, we will retry later.
            MatrixLog.d(TAG, "system ignore our gc request, wait for next detection.");
            return Status.RETRY;
        }
        final Iterator<DestroyedActivityInfo> infoIt = mDestroyedActivityInfos.iterator();
        while (infoIt.hasNext()) {
            final DestroyedActivityInfo destroyedActivityInfo = infoIt.next();
            if (!mResourcePlugin.getConfig().getDetectDebugger() && isPublished(destroyedActivityInfo.mActivityName) && mDumpHprofMode != ResourceConfig.DumpMode.SILENCE_DUMP) {
                MatrixLog.v(TAG, "activity with key [%s] was already published.", destroyedActivityInfo.mActivityName);
                infoIt.remove();
                continue;
            }
            if (destroyedActivityInfo.mActivityRef.get() == null) {
                // The activity was recycled by a gc triggered outside.
                MatrixLog.v(TAG, "activity with key [%s] was already recycled.", destroyedActivityInfo.mKey);
                infoIt.remove();
                continue;
            }
            ++destroyedActivityInfo.mDetectedCount;
            if (destroyedActivityInfo.mDetectedCount < mMaxRedetectTimes
                || !mResourcePlugin.getConfig().getDetectDebugger()) {
                // Although the sentinel tell us the activity should have been recycled,
                // system may still ignore it, so try again until we reach max retry times.
                MatrixLog.i(TAG, "activity with key [%s] should be recycled but actually still \n"
                        + "exists in %s times, wait for next detection to confirm.",
                    destroyedActivityInfo.mKey, destroyedActivityInfo.mDetectedCount);
                continue;
            }
            MatrixLog.i(TAG, "activity with key [%s] was suspected to be a leaked instance. mode[%s]", destroyedActivityInfo.mKey, mDumpHprofMode);
            if (mDumpHprofMode == ResourceConfig.DumpMode.SILENCE_DUMP) {
                if (mResourcePlugin != null && !isPublished(destroyedActivityInfo.mActivityName)) {
                    final JSONObject resultJson = new JSONObject();
                    try {
                        resultJson.put(SharePluginInfo.ISSUE_ACTIVITY_NAME, destroyedActivityInfo.mActivityName);
                    } catch (JSONException e) {
                        MatrixLog.printErrStackTrace(TAG, e, "unexpected exception.");
                    }
                    mResourcePlugin.onDetectIssue(new Issue(resultJson));
                }
                if (null != activityLeakCallback) {
                    activityLeakCallback.onLeak(destroyedActivityInfo.mActivityName, destroyedActivityInfo.mKey);
                }
            } else if (mDumpHprofMode == ResourceConfig.DumpMode.AUTO_DUMP) {
                final File hprofFile = mHeapDumper.dumpHeap();
                if (hprofFile != null) {
                    markPublished(destroyedActivityInfo.mActivityName);
                    final HeapDump heapDump = new HeapDump(hprofFile, destroyedActivityInfo.mKey, destroyedActivityInfo.mActivityName);
                    mHeapDumpHandler.process(heapDump);
                    infoIt.remove();
                } else {
                    MatrixLog.i(TAG, "heap dump for further analyzing activity with key [%s] was failed, just ignore.",
                            destroyedActivityInfo.mKey);
                    infoIt.remove();
                }
            } else if (mDumpHprofMode == ResourceConfig.DumpMode.MANUAL_DUMP) {
                NotificationManager notificationManager = (NotificationManager) context.getSystemService(Context.NOTIFICATION_SERVICE);
                String dumpingHeapContent = context.getString(R.string.resource_canary_leak_tip);
                String dumpingHeapTitle = destroyedActivityInfo.mActivityName;
                mContentIntent.putExtra(SharePluginInfo.ISSUE_ACTIVITY_NAME, destroyedActivityInfo.mActivityName);
                mContentIntent.putExtra(SharePluginInfo.ISSUE_REF_KEY, destroyedActivityInfo.mKey);
                PendingIntent pIntent = PendingIntent.getActivity(context, 0, mContentIntent,
                        PendingIntent.FLAG_UPDATE_CURRENT);
                NotificationCompat.Builder builder = new NotificationCompat.Builder(context)
                        .setContentTitle(dumpingHeapTitle)
                        .setContentIntent(pIntent)
                        .setContentText(dumpingHeapContent);
                Notification notification = buildNotification(context, builder);
                notificationManager.notify(NOTIFICATION_ID, notification);
                infoIt.remove();
                markPublished(destroyedActivityInfo.mActivityName);
                MatrixLog.i(TAG, "show notification for notify activity leak. %s", destroyedActivityInfo.mActivityName);
            } else {
                // Lightweight mode, just report leaked activity name.
                MatrixLog.i(TAG, "lightweight mode, just report leaked activity name.");
                markPublished(destroyedActivityInfo.mActivityName);
                if (mResourcePlugin != null) {
                    final JSONObject resultJson = new JSONObject();
                    try {
                        resultJson.put(SharePluginInfo.ISSUE_ACTIVITY_NAME, destroyedActivityInfo.mActivityName);
                    } catch (JSONException e) {
                        MatrixLog.printErrStackTrace(TAG, e, "unexpected exception.");
                    }
                    mResourcePlugin.onDetectIssue(new Issue(resultJson));
                }
            }
        }
        return Status.RETRY;
    }
};

Hprof 裁剪

在 Hprof dump 下来之后,为了减小上报的文件体积、提高上报成功率,会对 Hprof 进行裁剪。回看上面 AUTO_DUMP 模式的部分代码:

这里先通过 mHeapDumper.dumpHeap 生成了对应的 Hprof 文件,之后通过 mHeapDumpHandler 对其进行处理。

if (mDumpHprofMode == ResourceConfig.DumpMode.AUTO_DUMP) {
    final File hprofFile = mHeapDumper.dumpHeap();
    if (hprofFile != null) {
        markPublished(destroyedActivityInfo.mActivityName);
        final HeapDump heapDump = new HeapDump(hprofFile, destroyedActivityInfo.mKey, destroyedActivityInfo.mActivityName);
        mHeapDumpHandler.process(heapDump);
        infoIt.remove();
    } else {
        MatrixLog.i(TAG, "heap dump for further analyzing activity with key [%s] was failed, just ignore.",
                destroyedActivityInfo.mKey);
        infoIt.remove();
    }
}

它会调用 CanaryWorkerService.shrinkHprofAndReport 方法对 Hprof 文件进行裁剪,然后进行上报。

protected AndroidHeapDumper.HeapDumpHandler createHeapDumpHandler(final Context context, ResourceConfig resourceConfig) {
    return new AndroidHeapDumper.HeapDumpHandler() {
        @Override
        public void process(HeapDump result) {
            CanaryWorkerService.shrinkHprofAndReport(context, result);
        }
    };
}

这里会先通过 HprofBufferShrinker.shrink 对 Hprof 进行裁剪,再将裁剪结果与 result.info 一起压缩成 zip 后上报。

private void doShrinkHprofAndReport(HeapDump heapDump) {
    final File hprofDir = heapDump.getHprofFile().getParentFile();
    final File shrinkedHProfFile = new File(hprofDir, getShrinkHprofName(heapDump.getHprofFile()));
    final File zipResFile = new File(hprofDir, getResultZipName("dump_result_" + android.os.Process.myPid()));
    final File hprofFile = heapDump.getHprofFile();
    ZipOutputStream zos = null;
    try {
        long startTime = System.currentTimeMillis();
        new HprofBufferShrinker().shrink(hprofFile, shrinkedHProfFile);
        MatrixLog.i(TAG, "shrink hprof file %s, size: %dk to %s, size: %dk, use time:%d",
                hprofFile.getPath(), hprofFile.length() / 1024, shrinkedHProfFile.getPath(), shrinkedHProfFile.length() / 1024, (System.currentTimeMillis() - startTime));
        zos = new ZipOutputStream(new BufferedOutputStream(new FileOutputStream(zipResFile)));
        final ZipEntry resultInfoEntry = new ZipEntry("result.info");
        final ZipEntry shrinkedHProfEntry = new ZipEntry(shrinkedHProfFile.getName());
        zos.putNextEntry(resultInfoEntry);
        final PrintWriter pw = new PrintWriter(new OutputStreamWriter(zos, Charset.forName("UTF-8")));
        pw.println("# Resource Canary Result Infomation. THIS FILE IS IMPORTANT FOR THE ANALYZER !!");
        pw.println("sdkVersion=" + Build.VERSION.SDK_INT);
        pw.println("manufacturer=" + Build.MANUFACTURER);
        pw.println("hprofEntry=" + shrinkedHProfEntry.getName());
        pw.println("leakedActivityKey=" + heapDump.getReferenceKey());
        pw.flush();
        zos.closeEntry();
        zos.putNextEntry(shrinkedHProfEntry);
        copyFileToStream(shrinkedHProfFile, zos);
        zos.closeEntry();
        shrinkedHProfFile.delete();
        hprofFile.delete();
        MatrixLog.i(TAG, "process hprof file use total time:%d", (System.currentTimeMillis() - startTime));
        CanaryResultService.reportHprofResult(this, zipResFile.getAbsolutePath(), heapDump.getActivityName());
    } catch (IOException e) {
        MatrixLog.printErrStackTrace(TAG, e, "");
    } finally {
        closeQuietly(zos);
    }
}

裁剪过程会依次用三个 HprofVisitor,每次都从文件头开始扫描一遍,从而完成裁剪。

public void shrink(File hprofIn, File hprofOut) throws IOException {
    FileInputStream is = null;
    OutputStream os = null;
    try {
        is = new FileInputStream(hprofIn);
        os = new BufferedOutputStream(new FileOutputStream(hprofOut));
        final HprofReader reader = new HprofReader(new BufferedInputStream(is));
        reader.accept(new HprofInfoCollectVisitor());
        // Reset.
        is.getChannel().position(0);
        reader.accept(new HprofKeptBufferCollectVisitor());
        // Reset.
        is.getChannel().position(0);
        reader.accept(new HprofBufferShrinkVisitor(new HprofWriter(os)));
    } finally {
        if (os != null) {
            try {
                os.close();
            } catch (Throwable thr) {
                // Ignored.
            }
        }
        if (is != null) {
            try {
                is.close();
            } catch (Throwable thr) {
                // Ignored.
            }
        }
    }
}

HprofInfoCollectVisitor

HprofInfoCollectVisitor 主要完成信息的收集,可以看到它主要收集如 mBitmapClassNameStringId、mBmpClassId、mBmpClassInstanceFields 之类的信息。

private class HprofInfoCollectVisitor extends HprofVisitor {
    HprofInfoCollectVisitor() {
        super(null);
    }
    @Override
    public void visitHeader(String text, int idSize, long timestamp) {
        mIdSize = idSize;
        mNullBufferId = ID.createNullID(idSize);
    }
    @Override
    public void visitStringRecord(ID id, String text, int timestamp, long length) {
        if (mBitmapClassNameStringId == null && "android.graphics.Bitmap".equals(text)) {
            mBitmapClassNameStringId = id;
        } else if (mMBufferFieldNameStringId == null && "mBuffer".equals(text)) {
            mMBufferFieldNameStringId = id;
        } else if (mMRecycledFieldNameStringId == null && "mRecycled".equals(text)) {
            mMRecycledFieldNameStringId = id;
        } else if (mStringClassNameStringId == null && "java.lang.String".equals(text)) {
            mStringClassNameStringId = id;
        } else if (mValueFieldNameStringId == null && "value".equals(text)) {
            mValueFieldNameStringId = id;
        }
    }
    @Override
    public void visitLoadClassRecord(int serialNumber, ID classObjectId, int stackTraceSerial, ID classNameStringId, int timestamp, long length) {
        if (mBmpClassId == null && mBitmapClassNameStringId != null && mBitmapClassNameStringId.equals(classNameStringId)) {
            mBmpClassId = classObjectId;
        } else if (mStringClassId == null && mStringClassNameStringId != null && mStringClassNameStringId.equals(classNameStringId)) {
            mStringClassId = classObjectId;
        }
    }
    @Override
    public HprofHeapDumpVisitor visitHeapDumpRecord(int tag, int timestamp, long length) {
        return new HprofHeapDumpVisitor(null) {
            @Override
            public void visitHeapDumpClass(ID id, int stackSerialNumber, ID superClassId, ID classLoaderId, int instanceSize, Field[] staticFields, Field[] instanceFields) {
                if (mBmpClassInstanceFields == null && mBmpClassId != null && mBmpClassId.equals(id)) {
                    mBmpClassInstanceFields = instanceFields;
                } else if (mStringClassInstanceFields == null && mStringClassId != null && mStringClassId.equals(id)) {
                    mStringClassInstanceFields = instanceFields;
                }
            }
        };
    }
}

HprofKeptBufferCollectVisitor

visitHeapDumpInstance 方法中主要收集 Bitmap 实例中名为 mBuffer 的字段对应的 bufferId,以及 String 实例中名为 value 的字段对应的 strValueId。

visitHeapDumpPrimitiveArray 方法中会将 id 及其对应的 elements 数组放入 mBufferIdToElementDataMap

visitEnd 方法中则会遍历 mBufferIdToElementDataMap,并对对应的 elements 进行 MD5 Hash,根据其是否存在于 duplicateBufferFilterMap 即可确定其数据是否重复,若出现了重复会将其记录到 mBmpBufferIdToDeduplicatedIdMap 中,供之后进行处理。

private class HprofKeptBufferCollectVisitor extends HprofVisitor {
    HprofKeptBufferCollectVisitor() {
        super(null);
    }
    @Override
    public HprofHeapDumpVisitor visitHeapDumpRecord(int tag, int timestamp, long length) {
        return new HprofHeapDumpVisitor(null) {
            @Override
            public void visitHeapDumpInstance(ID id, int stackId, ID typeId, byte[] instanceData) {
                try {
                    if (mBmpClassId != null && mBmpClassId.equals(typeId)) {
                        ID bufferId = null;
                        Boolean isRecycled = null;
                        final ByteArrayInputStream bais = new ByteArrayInputStream(instanceData);
                        for (Field field : mBmpClassInstanceFields) {
                            final ID fieldNameStringId = field.nameId;
                            final Type fieldType = Type.getType(field.typeId);
                            if (fieldType == null) {
                                throw new IllegalStateException("visit bmp instance failed, lost type def of typeId: " + field.typeId);
                            }
                            if (mMBufferFieldNameStringId.equals(fieldNameStringId)) {
                                bufferId = (ID) IOUtil.readValue(bais, fieldType, mIdSize);
                            } else if (mMRecycledFieldNameStringId.equals(fieldNameStringId)) {
                                isRecycled = (Boolean) IOUtil.readValue(bais, fieldType, mIdSize);
                            } else if (bufferId == null || isRecycled == null) {
                                IOUtil.skipValue(bais, fieldType, mIdSize);
                            } else {
                                break;
                            }
                        }
                        bais.close();
                        final boolean reguardAsNotRecycledBmp = (isRecycled == null || !isRecycled);
                        if (bufferId != null && reguardAsNotRecycledBmp && !bufferId.equals(mNullBufferId)) {
                            mBmpBufferIds.add(bufferId);
                        }
                    } else if (mStringClassId != null && mStringClassId.equals(typeId)) {
                        ID strValueId = null;
                        final ByteArrayInputStream bais = new ByteArrayInputStream(instanceData);
                        for (Field field : mStringClassInstanceFields) {
                            final ID fieldNameStringId = field.nameId;
                            final Type fieldType = Type.getType(field.typeId);
                            if (fieldType == null) {
                                throw new IllegalStateException("visit string instance failed, lost type def of typeId: " + field.typeId);
                            }
                            if (mValueFieldNameStringId.equals(fieldNameStringId)) {
                                strValueId = (ID) IOUtil.readValue(bais, fieldType, mIdSize);
                            } else if (strValueId == null) {
                                IOUtil.skipValue(bais, fieldType, mIdSize);
                            } else {
                                break;
                            }
                        }
                        bais.close();
                        if (strValueId != null && !strValueId.equals(mNullBufferId)) {
                            mStringValueIds.add(strValueId);
                        }
                    }
                } catch (Throwable thr) {
                    throw new RuntimeException(thr);
                }
            }
            @Override
            public void visitHeapDumpPrimitiveArray(int tag, ID id, int stackId, int numElements, int typeId, byte[] elements) {
                mBufferIdToElementDataMap.put(id, elements);
            }
        };
    }
    @Override
    public void visitEnd() {
        final Set<Map.Entry<ID, byte[]>> idDataSet = mBufferIdToElementDataMap.entrySet();
        final Map<String, ID> duplicateBufferFilterMap = new HashMap<>();
        for (Map.Entry<ID, byte[]> idDataPair : idDataSet) {
            final ID bufferId = idDataPair.getKey();
            final byte[] elementData = idDataPair.getValue();
            if (!mBmpBufferIds.contains(bufferId)) {
                // Discard non-bitmap buffer.
                continue;
            }
            final String buffMd5 = DigestUtil.getMD5String(elementData);
            final ID mergedBufferId = duplicateBufferFilterMap.get(buffMd5);
            if (mergedBufferId == null) {
                duplicateBufferFilterMap.put(buffMd5, bufferId);
            } else {
                mBmpBufferIdToDeduplicatedIdMap.put(mergedBufferId, mergedBufferId);
                mBmpBufferIdToDeduplicatedIdMap.put(bufferId, mergedBufferId);
            }
        }
        // Save memory cost.
        mBufferIdToElementDataMap.clear();
    }
}
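上面 visitEnd 中「按内容 MD5 建立 bufferId → 去重后 Id 映射」的思路,可以抽出为如下独立示意(这里用 int 代替 Matrix 中的 ID 类型,仅演示去重映射的构建方式):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class DedupDemo {
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // JVM 必带 MD5,正常不会走到这里
        }
    }

    // 对每个 (id, buffer),内容相同的 buffer 都映射到第一次出现的 id
    static Map<Integer, Integer> dedup(Map<Integer, byte[]> idToBuffer) {
        Map<String, Integer> firstSeen = new HashMap<>();
        Map<Integer, Integer> idToMergedId = new HashMap<>();
        for (Map.Entry<Integer, byte[]> e : idToBuffer.entrySet()) {
            String md5 = md5Hex(e.getValue());
            Integer mergedId = firstSeen.get(md5);
            if (mergedId == null) {
                firstSeen.put(md5, e.getKey());
            } else {
                idToMergedId.put(mergedId, mergedId);   // 第一份数据指向自己,表示保留
                idToMergedId.put(e.getKey(), mergedId); // 重复数据指向第一份
            }
        }
        return idToMergedId;
    }

    public static void main(String[] args) {
        Map<Integer, byte[]> buffers = new LinkedHashMap<>();
        buffers.put(1, new byte[]{1, 2, 3});
        buffers.put(2, new byte[]{1, 2, 3}); // 与 id=1 内容重复
        buffers.put(3, new byte[]{9});       // 无重复,不会进入映射
        System.out.println(dedup(buffers)); // {1=1, 2=1}
    }
}
```

之后的写入阶段只需查这张映射:映射到自身的 id 保留数据,映射到别人的 id 丢弃数据并改写引用。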

HprofBufferShrinkVisitor

该 Visitor 比较特殊,在创建时需要传入一个 HprofWriter,通过它将数据写入新的 shrinkedHprofFile。具体写入原理是:每次调用 super.xxx 时,都会调用到 HprofWriter.xxx,从而将对应数据写入。

visitHeapDumpInstance 方法中,它会获取到 Bitmap 实例对应的 bufferId,然后查找其是否存在内容重复的 buffer,如果有则调用 modifyIdInBuffer 方法,将实例数据中对应位置的 bufferId 替换为去重后保留的那个 Id。

visitHeapDumpPrimitiveArray 中,由于 bufferId 的去重映射已经计算完成,对于重复的 Bitmap buffer 只保留第一份数据,其余不再写入(非 Bitmap 且不属于 String value 的数组同样会被丢弃)。这样 hprof 文件中原本多个重复 Bitmap buffer 占用的空间就被裁剪掉了。

private class HprofBufferShrinkVisitor extends HprofVisitor {
    HprofBufferShrinkVisitor(HprofWriter hprofWriter) {
        super(hprofWriter);
    }

    @Override
    public HprofHeapDumpVisitor visitHeapDumpRecord(int tag, int timestamp, long length) {
        return new HprofHeapDumpVisitor(super.visitHeapDumpRecord(tag, timestamp, length)) {
            @Override
            public void visitHeapDumpInstance(ID id, int stackId, ID typeId, byte[] instanceData) {
                try {
                    if (typeId.equals(mBmpClassId)) {
                        ID bufferId = null;
                        int bufferIdPos = 0;
                        final ByteArrayInputStream bais = new ByteArrayInputStream(instanceData);
                        for (Field field : mBmpClassInstanceFields) {
                            final ID fieldNameStringId = field.nameId;
                            final Type fieldType = Type.getType(field.typeId);
                            if (fieldType == null) {
                                throw new IllegalStateException("visit instance failed, lost type def of typeId: " + field.typeId);
                            }
                            if (mMBufferFieldNameStringId.equals(fieldNameStringId)) {
                                bufferId = (ID) IOUtil.readValue(bais, fieldType, mIdSize);
                                break;
                            } else {
                                bufferIdPos += IOUtil.skipValue(bais, fieldType, mIdSize);
                            }
                        }
                        if (bufferId != null) {
                            final ID deduplicatedId = mBmpBufferIdToDeduplicatedIdMap.get(bufferId);
                            if (deduplicatedId != null && !bufferId.equals(deduplicatedId) && !bufferId.equals(mNullBufferId)) {
                                modifyIdInBuffer(instanceData, bufferIdPos, deduplicatedId);
                            }
                        }
                    }
                } catch (Throwable thr) {
                    throw new RuntimeException(thr);
                }
                super.visitHeapDumpInstance(id, stackId, typeId, instanceData);
            }

            private void modifyIdInBuffer(byte[] buf, int off, ID newId) {
                final ByteBuffer bBuf = ByteBuffer.wrap(buf);
                bBuf.position(off);
                bBuf.put(newId.getBytes());
            }

            @Override
            public void visitHeapDumpPrimitiveArray(int tag, ID id, int stackId, int numElements, int typeId, byte[] elements) {
                final ID deduplicatedID = mBmpBufferIdToDeduplicatedIdMap.get(id);
                // Discard non-bitmap or duplicated bitmap buffer but keep reference key.
                if (deduplicatedID == null || !id.equals(deduplicatedID)) {
                    if (!mStringValueIds.contains(id)) {
                        return;
                    }
                }
                super.visitHeapDumpPrimitiveArray(tag, id, stackId, numElements, typeId, elements);
            }
        };
    }
}

线下分析

matrix-resource-canary-analyzer 主要负责了线下分析的功能,它是一个命令行工具。

例如前面的 Activity 内存泄漏时,我们并没有去获取它对应的泄漏链,因为这个过程比较耗时,Matrix 将其放到了线下进行分析,通过上报的 hprof 文件即可分析出该 Activity 的泄漏链。

Activity 内存泄漏链分析

分析时会根据传入的泄漏 Activity 的 refKey,调用 findLeakingReference 在 Snapshot 中找到该 Activity 对应的 Instance,再调用 findLeakTrace 找出整条泄漏引用链。

private ActivityLeakResult checkForLeak(HeapSnapshot heapSnapshot, String refKey) {
    long analysisStartNanoTime = System.nanoTime();
    try {
        final Snapshot snapshot = heapSnapshot.getSnapshot();
        final Instance leakingRef = findLeakingReference(refKey, snapshot);
        // False alarm, weak reference was cleared in between key check and heap dump.
        if (leakingRef == null) {
            return ActivityLeakResult.noLeak(AnalyzeUtil.since(analysisStartNanoTime));
        }
        return findLeakTrace(analysisStartNanoTime, snapshot, leakingRef);
    } catch (Throwable e) {
        e.printStackTrace();
        return ActivityLeakResult.failure(e, AnalyzeUtil.since(analysisStartNanoTime));
    }
}

之后会通过 ShortestPathFinder 进行引用链的查找,最后根据引用链是否为 empty 判断其是否真正发生了内存泄漏,若发生了则会进行上报。

private ActivityLeakResult findLeakTrace(long analysisStartNanoTime, Snapshot snapshot,
                                     Instance leakingRef) {
    ShortestPathFinder pathFinder = new ShortestPathFinder(mExcludedRefs);
    ShortestPathFinder.Result result = pathFinder.findPath(snapshot, leakingRef);
    // False alarm, no strong reference path to GC Roots.
    if (result.referenceChainHead == null) {
        return ActivityLeakResult.noLeak(AnalyzeUtil.since(analysisStartNanoTime));
    }
    final ReferenceChain referenceChain = result.buildReferenceChain();
    final String className = leakingRef.getClassObj().getClassName();
    if (result.excludingKnown || referenceChain.isEmpty()) {
        return ActivityLeakResult.noLeak(AnalyzeUtil.since(analysisStartNanoTime));
    } else {
        return ActivityLeakResult.leakDetected(false, className, referenceChain,
                AnalyzeUtil.since(analysisStartNanoTime));
    }
}

ShortestPathFinder.findPath 能够返回传入的 Instance 集合中每个 Instance 的引用链。

它会首先调用 enqueueGcRoots 将所有 GcRoot 加入队列,然后在 while 循环中不断从 toVisitQueue 或 toVisitIfNoPathQueue 中取出引用节点并向下查找,当所有目标 Instance 的引用链都已找到(或队列耗尽)时查找结束,返回结果。

个人认为,相比通过 haha 库不断 moveToParent 的反向查找方式,这样可以减少计算量,因为多个 Instance 的引用链中可能存在重叠的部分。

public Map<Instance, Result> findPath(Snapshot snapshot, Collection<Instance> targetReferences) {
    final Map<Instance, Result> results = new HashMap<>();
    if (targetReferences.isEmpty()) {
        return results;
    }
    clearState();
    enqueueGcRoots(snapshot);
    canIgnoreStrings = true;
    for (Instance targetReference : targetReferences) {
        if (isString(targetReference)) {
            canIgnoreStrings = false;
            break;
        }
    }
    final Set<Instance> targetRefSet = new HashSet<>(targetReferences);
    while (!toVisitQueue.isEmpty() || !toVisitIfNoPathQueue.isEmpty()) {
        ReferenceNode node;
        if (!toVisitQueue.isEmpty()) {
            node = toVisitQueue.poll();
        } else {
            node = toVisitIfNoPathQueue.poll();
            if (node.exclusion == null) {
                throw new IllegalStateException("Expected node to have an exclusion " + node);
            }
        }
        // Termination
        if (targetRefSet.contains(node.instance)) {
            results.put(node.instance, new Result(node, node.exclusion != null));
            targetRefSet.remove(node.instance);
            if (targetRefSet.isEmpty()) {
                break;
            }
        }
        if (checkSeen(node)) {
            continue;
        }
        if (node.instance instanceof RootObj) {
            visitRootObj(node);
        } else if (node.instance instanceof ClassObj) {
            visitClassObj(node);
        } else if (node.instance instanceof ClassInstance) {
            visitClassInstance(node);
        } else if (node.instance instanceof ArrayInstance) {
            visitArrayInstance(node);
        } else {
            throw new IllegalStateException("Unexpected type for " + node.instance);
        }
    }
    return results;
}

Duplicated Bitmap Analysis

For duplicated-Bitmap analysis, it first traverses every reference node starting from the GC roots via ShortestDistanceVisitor.doVisit, then analyzes duplicated Bitmaps with findDuplicatedBitmap.

@Override
public DuplicatedBitmapResult analyze(HeapSnapshot heapSnapshot) {
    final long analysisStartNanoTime = System.nanoTime();
    try {
        final Snapshot snapshot = heapSnapshot.getSnapshot();
        new ShortestDistanceVisitor().doVisit(snapshot.getGCRoots());
        return findDuplicatedBitmap(analysisStartNanoTime, snapshot);
    } catch (Throwable e) {
        e.printStackTrace();
        return DuplicatedBitmapResult.failure(e, AnalyzeUtil.since(analysisStartNanoTime));
    }
}

Here it first collects all reachable Bitmap objects and obtains their buffers; Bitmaps whose buffer exceeds the configured size threshold are put into the byteArrayToBitmapMap map, and every buffer is stored in byteArrays. Note that when a duplicate buffer is encountered, an identical copy of it is made, which restores the state before the heap dump was trimmed.

It then searches for duplicated Bitmaps; if duplicates are found, their shortest reference chains are located via ShortestPathFinder and reported.

private DuplicatedBitmapResult findDuplicatedBitmap(long analysisStartNanoTime, Snapshot snapshot) {
    final ClassObj bitmapClass = snapshot.findClass("android.graphics.Bitmap");
    if (bitmapClass == null) {
        return DuplicatedBitmapResult.noDuplicatedBitmap(AnalyzeUtil.since(analysisStartNanoTime));
    }
    final Map<ArrayInstance, Instance> byteArrayToBitmapMap = new HashMap<>();
    final Set<ArrayInstance> byteArrays = new HashSet<>();
    final List<Instance> reachableInstances = new ArrayList<>();
    for (Heap heap : snapshot.getHeaps()) {
        if (!"default".equals(heap.getName()) && !"app".equals(heap.getName())) {
            continue;
        }
        final List<Instance> bitmapInstances = bitmapClass.getHeapInstances(heap.getId());
        for (Instance bitmapInstance : bitmapInstances) {
            if (bitmapInstance.getDistanceToGcRoot() == Integer.MAX_VALUE) {
                continue;
            }
            reachableInstances.add(bitmapInstance);
        }
        for (Instance bitmapInstance : reachableInstances) {
            ArrayInstance buffer = HahaHelper.fieldValue(((ClassInstance) bitmapInstance).getValues(), "mBuffer");
            if (buffer != null) {
                // sizeof(byte) * bufferLength -> bufferSize
                final int bufferSize = buffer.getSize();
                if (bufferSize < mMinBmpLeakSize) {
                    // Ignore tiny bmp leaks.
                    System.out.println(" + Skipped a bitmap with size: " + bufferSize);
                    continue;
                }
                if (byteArrayToBitmapMap.containsKey(buffer)) {
                    buffer = cloneArrayInstance(buffer);
                }
                byteArrayToBitmapMap.put(buffer, bitmapInstance);
            } else {
                System.out.println(" + Skipped a no-data bitmap");
            }
        }
        byteArrays.addAll(byteArrayToBitmapMap.keySet());
    }
    if (byteArrays.size() <= 1) {
        return DuplicatedBitmapResult.noDuplicatedBitmap(AnalyzeUtil.since(analysisStartNanoTime));
    }
    final List<DuplicatedBitmapEntry> duplicatedBitmapEntries = new ArrayList<>();
    final List<Set<ArrayInstance>> commonPrefixSets = new ArrayList<>();
    final List<Set<ArrayInstance>> reducedPrefixSets = new ArrayList<>();
    commonPrefixSets.add(byteArrays);
    // Cache the values since instance.getValues() recreates the array on every invocation.
    final Map<ArrayInstance, Object[]> cachedValues = new HashMap<>();
    for (ArrayInstance instance : byteArrays) {
        cachedValues.put(instance, instance.getValues());
    }
    int columnIndex = 0;
    while (!commonPrefixSets.isEmpty()) {
        for (Set<ArrayInstance> commonPrefixArrays : commonPrefixSets) {
            Map<Object, Set<ArrayInstance>> entryClassifier = new HashMap<>(
                    commonPrefixArrays.size());
            for (ArrayInstance arrayInstance : commonPrefixArrays) {
                final Object element = cachedValues.get(arrayInstance)[columnIndex];
                if (entryClassifier.containsKey(element)) {
                    entryClassifier.get(element).add(arrayInstance);
                } else {
                    Set<ArrayInstance> instanceSet = new HashSet<>();
                    instanceSet.add(arrayInstance);
                    entryClassifier.put(element, instanceSet);
                }
            }
            for (Set<ArrayInstance> branch : entryClassifier.values()) {
                if (branch.size() <= 1) {
                    // Unique branch, ignore it and it won't be counted towards duplication.
                    continue;
                }
                final Set<ArrayInstance> terminatedArrays = new HashSet<>();
                // Move all ArrayInstance that we have hit the end of to the candidate result list.
                for (ArrayInstance instance : branch) {
                    if (HahaHelper.getArrayInstanceLength(instance) == columnIndex + 1) {
                        terminatedArrays.add(instance);
                    }
                }
                branch.removeAll(terminatedArrays);
                // Exact duplicated arrays found.
                if (terminatedArrays.size() > 1) {
                    byte[] rawBuffer = null;
                    int width = 0;
                    int height = 0;
                    final List<Instance> duplicateBitmaps = new ArrayList<>();
                    for (ArrayInstance terminatedArray : terminatedArrays) {
                        final Instance bmpInstance = byteArrayToBitmapMap.get(terminatedArray);
                        duplicateBitmaps.add(bmpInstance);
                        if (rawBuffer == null) {
                            final List<FieldValue> fieldValues = ((ClassInstance) bmpInstance).getValues();
                            width = HahaHelper.fieldValue(fieldValues, "mWidth");
                            height = HahaHelper.fieldValue(fieldValues, "mHeight");
                            final int byteArraySize = HahaHelper.getArrayInstanceLength(terminatedArray);
                            rawBuffer = HahaHelper.asRawByteArray(terminatedArray, 0, byteArraySize);
                        }
                    }
                    final Map<Instance, Result> results = new ShortestPathFinder(mExcludedBmps)
                            .findPath(snapshot, duplicateBitmaps);
                    final List<ReferenceChain> referenceChains = new ArrayList<>();
                    for (Result result : results.values()) {
                        if (result.excludingKnown) {
                            continue;
                        }
                        ReferenceNode currRefChainNode = result.referenceChainHead;
                        while (currRefChainNode.parent != null) {
                            final ReferenceNode tempNode = currRefChainNode.parent;
                            if (tempNode.instance == null) {
                                currRefChainNode = tempNode;
                                continue;
                            }
                            final Heap heap = tempNode.instance.getHeap();
                            if (heap != null && !"app".equals(heap.getName())) {
                                break;
                            } else {
                                currRefChainNode = tempNode;
                            }
                        }
                        final Instance gcRootHolder = currRefChainNode.instance;
                        if (!(gcRootHolder instanceof ClassObj)) {
                            continue;
                        }
                        final String holderClassName = ((ClassObj) gcRootHolder).getClassName();
                        boolean isExcluded = false;
                        for (ExcludedBmps.PatternInfo patternInfo : mExcludedBmps.mClassNamePatterns) {
                            if (!patternInfo.mForGCRootOnly) {
                                continue;
                            }
                            if (patternInfo.mPattern.matcher(holderClassName).matches()) {
                                System.out.println(" + Skipped a bitmap with gc root class: "
                                        + holderClassName + " by pattern: " + patternInfo.mPattern.toString());
                                isExcluded = true;
                                break;
                            }
                        }
                        if (!isExcluded) {
                            referenceChains.add(result.buildReferenceChain());
                        }
                    }
                    if (referenceChains.size() > 1) {
                        duplicatedBitmapEntries.add(new DuplicatedBitmapEntry(width, height, rawBuffer, referenceChains));
                    }
                }
                // If there are ArrayInstances that have identical prefixes and haven't hit the
                // end, add it back for the next iteration.
                if (branch.size() > 1) {
                    reducedPrefixSets.add(branch);
                }
            }
        }
        commonPrefixSets.clear();
        commonPrefixSets.addAll(reducedPrefixSets);
        reducedPrefixSets.clear();
        columnIndex++;
    }
    return DuplicatedBitmapResult.duplicatedBitmapDetected(duplicatedBitmapEntries, AnalyzeUtil.since(analysisStartNanoTime));
}
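The column-by-column grouping above behaves like a breadth-first walk over a trie of the pixel buffers: arrays sharing the same prefix stay in one group, and a group whose members all end at the current column contains exact duplicates. A minimal sketch of the same idea on plain byte arrays (illustrative only; the class and method names are made up, and unlike the source it checks array length before reading the column rather than after):

```java
import java.util.*;

// Sketch of the commonPrefixSets / terminatedArrays technique on plain byte
// arrays (illustrative only, not Matrix code).
public class PrefixDupFinder {
    static List<List<byte[]>> findDuplicates(List<byte[]> arrays) {
        List<List<byte[]>> duplicates = new ArrayList<>();
        List<List<byte[]>> groups = new ArrayList<>();
        groups.add(new ArrayList<>(arrays)); // start with one group: all arrays
        int column = 0;
        while (!groups.isEmpty()) {
            List<List<byte[]>> next = new ArrayList<>();
            for (List<byte[]> group : groups) {
                List<byte[]> terminated = new ArrayList<>();
                Map<Byte, List<byte[]>> byValue = new HashMap<>();
                for (byte[] a : group) {
                    if (a.length == column) {
                        terminated.add(a); // reached the end: prefix == whole array
                    } else {
                        byValue.computeIfAbsent(a[column], k -> new ArrayList<>()).add(a);
                    }
                }
                if (terminated.size() > 1) {
                    duplicates.add(terminated); // exact duplicated arrays found
                }
                for (List<byte[]> branch : byValue.values()) {
                    if (branch.size() > 1) {
                        next.add(branch); // still sharing a prefix, keep splitting
                    }
                }
            }
            groups = next;
            column++;
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<List<byte[]>> dups = findDuplicates(List.of(
                new byte[]{1, 2, 3}, new byte[]{1, 2, 3},
                new byte[]{1, 2, 4}, new byte[]{9}));
        System.out.println(dups.size()); // prints 1: only {1,2,3} is duplicated
    }
}
```

Grouping by one column at a time means each buffer byte is compared only against buffers that already share its prefix, instead of comparing every pair of buffers in full.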