Bit-stream recognition (BSR) has many applications, such as forensic
investigations, detection of copyright infringement, and malware analysis. We
propose the first BSR method that takes a raw input bit-stream and outputs a
class label without any preprocessing. To this end, we introduce a centrifuge
mechanism, in which the upstream layers (sub-net) capture global features and
signal the downstream layers (main-net) to switch their focus, even when part of
the input bit-stream has the same value. We applied the centrifuge mechanism to
compiler provenance recovery, a type of BSR, and achieved excellent
classification performance. Additionally, downstream transfer learning (DTL), one of the
learning methods we propose for the centrifuge mechanism, pre-trains the
main-net using the sub-net’s ground-truth labels instead of the sub-net’s output. We
found that sub-predictions made with DTL tend to be highly accurate when the
sub-label classification is essential to the main prediction.
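
As a rough illustration of the two ideas above, the following minimal sketch shows a sub-net whose output conditions a main-net, and a DTL-style switch that feeds the main-net the ground-truth sub-label instead of the sub-net's prediction. Everything in it is an assumption made for illustration: the names (`sub_net`, `main_net`, `forward`, `sub_truth`), the feature choices, the layer sizes, and the hand-picked weights are hypothetical and are not taken from the architecture described here.

```python
import math

# Hypothetical sizes: 2 sub-labels, 3 main labels, 4-bit local window.
N_SUB, N_MAIN, WINDOW = 2, 3, 4

# Hand-picked illustrative weights (untrained, assumed for this sketch only).
W_SUB = [[4.0, -4.0],
         [-4.0, 4.0]]
# Each row: WINDOW weights for local bits, then N_SUB weights for conditioning.
W_MAIN = [[0.5, 0.5, 0.5, 0.5, 2.0, 0.0],
          [-0.5, -0.5, -0.5, -0.5, 0.0, 2.0],
          [0.1, 0.1, 0.1, 0.1, 1.0, 1.0]]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def linear(w, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def global_features(bits):
    """Whole-stream statistics: fraction of ones and fraction of bit flips."""
    ones = sum(bits) / len(bits)
    flips = sum(a != b for a, b in zip(bits, bits[1:])) / (len(bits) - 1)
    return [ones, flips]


def sub_net(bits):
    """Upstream layers: predict a sub-label distribution from global features."""
    return softmax(linear(W_SUB, global_features(bits)))


def main_net(bits, cond):
    """Downstream layers: classify a local window, conditioned on `cond`."""
    local = [float(b) for b in bits[:WINDOW]]
    return softmax(linear(W_MAIN, local + cond))


def forward(bits, sub_truth=None):
    """With `sub_truth` set (DTL-style pre-training), condition the main-net
    on the one-hot ground-truth sub-label; otherwise use the sub-net output."""
    if sub_truth is None:
        cond = sub_net(bits)
    else:
        cond = [1.0 if k == sub_truth else 0.0 for k in range(N_SUB)]
    return main_net(bits, cond)


bits = [1, 0, 1, 0, 1, 1, 1, 1]
p_truth0 = forward(bits, sub_truth=0)  # conditioned on ground-truth sub-label 0
p_truth1 = forward(bits, sub_truth=1)  # conditioned on ground-truth sub-label 1
p_infer = forward(bits)                # conditioned on the sub-net's prediction
```

In this toy setup the local window is identical in all three calls, so any difference in the main-net's output comes only from the conditioning signal, which is the behavior the centrifuge mechanism is meant to enable.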

