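On Android, apps are told apart by their package name together with their digital signature. Google recommends using your own reversed domain name as the package-name prefix (for example com.google.), but nothing enforces this, so a package name alone cannot reliably identify an apk. That is why signing exists: as the name suggests, a signature stamps the apk with its author's mark.
First, how signing is done. In the Android source tree, build/target/product/security/ holds certificate and private-key pairs such as platform.x509.pem and platform.pk8. When building Android, the build log shows commands like the following: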
java -jar out/host/linux-x86/framework/signapk.jar build/target/product/security/platform.x509.pem build/target/product/security/platform.pk8 before_sign.apk signed.apk
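This runs signapk.jar and signs before_sign.apk with the certificate platform.x509.pem and the private key platform.pk8. The output, signed.apk, now carries our signature.
An apk is itself an archive in zip format, so any unzip tool can extract it. Comparing the apk before and after signing, the signed one has an extra META-INF directory containing three files: MANIFEST.MF, CERT.SF and CERT.RSA.
What are these three files for?
First, some background on MANIFEST.MF. Here is a passage from the Wikipedia article on JAR: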
On the Java platform, a Manifest file is a specific file contained within a JAR archive. It is used to define extension and package-related data. It is a metadata file that contains name-value pairs organized in different sections. If a JAR file is intended to be used as an executable file, the manifest file specifies the main class of the application. The manifest file is named MANIFEST.MF.
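In other words, a JAR contains a manifest file that describes the JAR's metadata. Metadata is "data about data": information describing the properties of other data, used for things such as locating storage, tracking history, resource lookup and file records. The manifest consists of name: value pairs, much like the key-value pairs of a HashMap, and the file is named MANIFEST.MF.
The MANIFEST.MF inside an apk lists every file in the apk (apart from the signature files themselves) together with the base64-encoded SHA1 hash of that file's contents, for example: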
Name: AndroidManifest.xml
SHA1-Digest: 7lLs5fV2H4ttapcDEdtJRTQOzpk=
This says that the SHA1 hash of AndroidManifest.xml, base64-encoded, is 7lLs5fV2H4ttapcDEdtJRTQOzpk=
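To make the role of SHA1-Digest concrete, here is a minimal Java sketch that checks one file against its manifest entry. The class name ManifestDigestCheck and its two methods are hypothetical names chosen for this illustration; only java.util.jar.Manifest, MessageDigest and Base64 come from the JDK. It assumes the manifest uses SHA1-Digest attributes as in the example above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper for illustration; not part of Android or signapk.
final class ManifestDigestCheck {

    /** Base64-encoded SHA-1 of the given bytes, the form stored under SHA1-Digest. */
    static String sha1Base64(byte[] content) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            return Base64.getEncoder().encodeToString(sha1.digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** True if the manifest lists entryName with a SHA1-Digest equal to the digest of content. */
    static boolean matches(String manifestText, String entryName, byte[] content) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            Attributes section = manifest.getAttributes(entryName);
            if (section == null) {
                return false; // the entry is not listed at all
            }
            return sha1Base64(content).equals(section.getValue("SHA1-Digest"));
        } catch (IOException e) {
            throw new IllegalArgumentException("malformed manifest", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in bytes for a file inside the apk.
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        String manifestText = "Manifest-Version: 1.0\r\n\r\n"
                + "Name: res/hello.txt\r\n"
                + "SHA1-Digest: " + sha1Base64(content) + "\r\n\r\n";
        System.out.println(matches(manifestText, "res/hello.txt", content));
    }
}
```

With a real apk you would read MANIFEST.MF and the entry's bytes from the archive instead of building them in memory.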
CERT.SF looks much like MANIFEST.MF, but instead of hashing file contents it lists the hash of each entry in MANIFEST.MF. An example makes this clearer:
Name: AndroidManifest.xml
SHA1-Digest-Manifest: 8CVc0D8U2qQKRD+7Fw7+Jmb6Qos=
The hash '8CVc0D8U2qQKRD+7Fw7+Jmb6Qos=' above is the hash of the following lines of MANIFEST.MF; that string is the input to the hash function.
Name: AndroidManifest.xml
SHA1-Digest: 7lLs5fV2H4ttapcDEdtJRTQOzpk=
Note: the input for SHA1-Digest-Manifest is three lines: the two lines above plus the blank line that ends the entry, i.e. '\r\n'.
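The rule in the note can be sketched in Java as follows. SfSectionDigest and its methods are hypothetical names for this illustration. Whether the result reproduces the value stored in a real CERT.SF depends on the exact bytes of the MANIFEST.MF section (line endings, and line wrapping for long names), which this sketch simply assumes to be two CRLF-terminated lines followed by an empty CRLF line.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper for illustration; not part of Android or signapk.
final class SfSectionDigest {

    /** Rebuilds one MANIFEST.MF section as it is hashed: two lines plus the blank line. */
    static String manifestSection(String entryName, String sha1Digest) {
        return "Name: " + entryName + "\r\n"
                + "SHA1-Digest: " + sha1Digest + "\r\n"
                + "\r\n"; // the trailing blank line is part of the hash input
    }

    /** Base64-encoded SHA-1 of the section text, the form stored under SHA1-Digest-Manifest. */
    static String sectionDigest(String section) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(section.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        String section = manifestSection("AndroidManifest.xml", "7lLs5fV2H4ttapcDEdtJRTQOzpk=");
        System.out.println("SHA1-Digest-Manifest: " + sectionDigest(section));
    }
}
```

Dropping the final "\r\n" from the input yields a different digest, which is the usual reason a hand-computed value fails to match CERT.SF.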
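CERT.RSA contains the digital signature over CERT.SF together with the certificate used for signing, platform.x509.pem (see the analysis of the SignApk program in the next section).
Both keytool and openssl can read it, though their output differs. First, with keytool: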
keytool -printcert -file ./CERT.RSA
Owner: EMAILADDRESS=android@android.com, CN=Android, OU=Android, O=Android, L=Mountain View, ST=California, C=US
Issuer: EMAILADDRESS=android@android.com, CN=Android, OU=Android, O=Android, L=Mountain View, ST=California, C=US
Serial number: b3998086d056cffa
Valid from: Wed Apr 16 06:40:50 CST 2008 until: Sun Sep 02 06:40:50 CST 2035
Certificate fingerprints:
MD5: 8D:DB:34:2F:2D:A5:40:84:02:D7:56:8A:F2:1E:29:F9
SHA1: 27:19:6E:38:6B:87:5E:76:AD:F7:00:E7:EA:84:E4:C6:EE:E3:3D:FA
SHA256: C8:A2:E9:BC:CF:59:7C:2F:B6:DC:66:BE:E2:93:FC:13:F2:FC:47:EC:77:BC:6B:2B:0D:52:C1:1F:51:19:2A:B8
Signature algorithm name: MD5withRSA
Version: 3
Extensions:
#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: 4F E4 A0 B3 DD 9C BA 29 F7 1D 72 87 C4 E7 C3 8F O......)..r.....
0010: 20 86 C2 99 ...
]
[EMAILADDRESS=android@android.com, CN=Android, OU=Android, O=Android, L=Mountain View, ST=California, C=US]
SerialNumber: [ b3998086 d056cffa]
]
#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
CA:true
PathLen:2147483647
]
#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 4F E4 A0 B3 DD 9C BA 29 F7 1D 72 87 C4 E7 C3 8F O......)..r.....
0010: 20 86 C2 99 ...
]
]
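The output above has a Certificate fingerprints section. At first I took these for fingerprints of the public key, but they are not: they are hashes of the platform.x509.pem certificate that signed the apk.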
Certificate fingerprints:
MD5: 8D:DB:34:2F:2D:A5:40:84:02:D7:56:8A:F2:1E:29:F9
SHA1: 27:19:6E:38:6B:87:5E:76:AD:F7:00:E7:EA:84:E4:C6:EE:E3:3D:FA
SHA256: C8:A2:E9:BC:CF:59:7C:2F:B6:DC:66:BE:E2:93:FC:13:F2:FC:47:EC:77:BC:6B:2B:0D:52:C1:1F:51:19:2A:B8
Signature algorithm name: MD5withRSA
Version: 3
How can we confirm this? platform.x509.pem is in PEM format, so first convert it to DER with openssl:
openssl x509 -in platform.x509.pem -outform DER -out cert.cer
Then hash the resulting cert.cer with SHA1:
sha1sum cert.cer
The output, shown below, is identical to the SHA1 value under Certificate fingerprints above.
27196e386b875e76adf700e7ea84e4c6eee33dfa cert.cer
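The same two steps (PEM to DER, then hash) can be sketched in Java. CertFingerprint and its methods are hypothetical names for this illustration, and the sketch assumes the PEM file contains only a single armored block with no extra text around it.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper for illustration; mirrors the openssl + sha1sum steps.
final class CertFingerprint {

    /** Strips the BEGIN/END armor and Base64-decodes the body, yielding the DER bytes. */
    static byte[] pemToDer(String pem) {
        String body = pem
                .replaceAll("-----(BEGIN|END) [A-Z0-9 ]+-----", "")
                .replaceAll("\\s", "");
        return Base64.getDecoder().decode(body);
    }

    /** Colon-separated upper-case hex digest, the layout keytool prints. */
    static String fingerprint(byte[] der, String algorithm) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(der);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                if (sb.length() > 0) {
                    sb.append(':');
                }
                sb.append(String.format("%02X", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a PEM certificate such as platform.x509.pem
        String pem = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.US_ASCII);
        byte[] der = pemToDer(pem);
        System.out.println("SHA1: " + fingerprint(der, "SHA-1"));
        System.out.println("SHA256: " + fingerprint(der, "SHA-256"));
    }
}
```

Because the digest is taken over the entire DER encoding, any change to the certificate changes the fingerprint.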
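So what are the hashes under Certificate fingerprints for?
As I understand it, they are not part of the certificate's content. They are computed from the certificate file itself (its DER encoding), and their main use is to tie an apk to the certificate that signed it.
Where, then, are the public key and the rest of the certificate's information? Let's keep going and inspect the certificate with openssl: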
openssl pkcs7 -inform DER -in CERT.RSA -noout -print_certs -text
The output is:
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 12941516320735154170 (0xb3998086d056cffa)
Signature Algorithm: md5WithRSAEncryption
Issuer: C=US, ST=California, L=Mountain View, O=Android, OU=Android, CN=Android/emailAddress=android@android.com
Validity
Not Before: Apr 15 22:40:50 2008 GMT
Not After : Sep 1 22:40:50 2035 GMT
Subject: C=US, ST=California, L=Mountain View, O=Android, OU=Android, CN=Android/emailAddress=android@android.com
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (2048 bit)
Modulus:
00:9c:78:05:92:ac:0d:5d:38:1c:de:aa:65:ec:c8:
a6:00:6e:36:48:0c:6d:72:07:b1:20:11:be:50:86:
3a:ab:e2:b5:5d:00:9a:df:71:46:d6:f2:20:22:80:
c7:cd:4d:7b:db:26:24:3b:8a:80:6c:26:b3:4b:13:
75:23:a4:92:68:22:49:04:dc:01:49:3e:7c:0a:cf:
1a:05:c8:74:f6:9b:03:7b:60:30:9d:90:74:d2:42:
80:e1:6b:ad:2a:87:34:36:19:51:ea:f7:2a:48:2d:
09:b2:04:b1:87:5e:12:ac:98:c1:aa:77:3d:68:00:
b9:ea:fd:e5:6d:58:be:d8:e8:da:16:f9:a3:60:09:
9c:37:a8:34:a6:df:ed:b7:b6:b4:4a:04:9e:07:a2:
69:fc:cf:2c:54:96:f2:cf:36:d6:4d:f9:0a:3b:8d:
8f:34:a3:ba:ab:4c:f5:33:71:ab:27:71:9b:3b:a5:
87:54:ad:0c:53:fc:14:e1:db:45:d5:1e:23:4f:bb:
e9:3c:9b:a4:ed:f9:ce:54:26:13:50:ec:53:56:07:
bf:69:a2:ff:4a:a0:7d:b5:f7:ea:20:0d:09:a6:c1:
b4:9e:21:40:2f:89:ed:11:90:89:3a:ab:5a:91:80:
f1:52:e8:2f:85:a4:57:53:cf:5f:c1:90:71:c5:ee:
c8:27
Exponent: 3 (0x3)
X509v3 extensions:
X509v3 Subject Key Identifier:
4F:E4:A0:B3:DD:9C:BA:29:F7:1D:72:87:C4:E7:C3:8F:20:86:C2:99
X509v3 Authority Key Identifier:
keyid:4F:E4:A0:B3:DD:9C:BA:29:F7:1D:72:87:C4:E7:C3:8F:20:86:C2:99
DirName:/C=US/ST=California/L=Mountain View/O=Android/OU=Android/CN=Android/emailAddress=android@android.com
serial:B3:99:80:86:D0:56:CF:FA
X509v3 Basic Constraints:
CA:TRUE
Signature Algorithm: md5WithRSAEncryption
57:25:51:b8:d9:3a:1f:73:de:0f:6d:46:9f:86:da:d6:70:14:
00:29:3c:88:a0:cd:7c:d7:78:b7:3d:af:cc:19:7f:ab:76:e6:
21:2e:56:c1:c7:61:cf:c4:2f:d7:33:de:52:c5:0a:e0:88:14:
ce:fc:0a:3b:5a:1a:43:46:05:4d:82:9f:1d:82:b4:2b:20:48:
bf:88:b5:d1:49:29:ef:85:f6:0e:dd:12:d7:2d:55:65:7e:22:
e3:e8:5d:04:c8:31:d6:13:d1:99:38:bb:89:82:24:7f:a3:21:
25:6b:a1:2d:1d:6a:8f:92:ea:1d:b1:c3:73:31:7b:a0:c0:37:
f0:d1:af:f6:45:ae:f2:24:97:9f:ba:6e:7a:14:bc:02:5c:71:
b9:81:38:ce:f3:dd:fc:05:96:17:cf:24:84:5c:f7:b4:0d:63:
82:f7:27:5e:d7:38:49:5a:b6:e5:93:1b:94:21:76:5c:49:1b:
72:fb:68:e0:80:db:db:58:c2:02:9d:34:7c:8b:32:8c:e4:3e:
f6:a8:b1:55:33:ed:fb:e9:89:bd:6a:48:dd:4b:20:2e:da:94:
c6:ab:8d:d5:b8:39:92:03:da:ae:2e:d4:46:23:2e:4f:e9:bd:
96:13:94:c6:30:0e:51:38:e3:cf:d2:85:e6:e4:e4:83:53:8c:
b8:b1:b3:57
In this output we see the Subject Public Key Info item, which presumably holds the public key information.
Below is the Certificate data structure as described in RFC 3280:
Certificate ::= SEQUENCE {
tbsCertificate TBSCertificate,
signatureAlgorithm AlgorithmIdentifier,
signatureValue BIT STRING }
TBSCertificate ::= SEQUENCE {
version [0] EXPLICIT Version DEFAULT v1,
serialNumber CertificateSerialNumber,
signature AlgorithmIdentifier,
issuer Name,
validity Validity,
subject Name,
subjectPublicKeyInfo SubjectPublicKeyInfo,
issuerUniqueID [1] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
subjectUniqueID [2] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
extensions [3] EXPLICIT Extensions OPTIONAL
-- If present, version MUST be v3
}
Version ::= INTEGER { v1(0), v2(1), v3(2) }
CertificateSerialNumber ::= INTEGER
Validity ::= SEQUENCE {
notBefore Time,
notAfter Time }
Time ::= CHOICE {
utcTime UTCTime,
generalTime GeneralizedTime }
UniqueIdentifier ::= BIT STRING
SubjectPublicKeyInfo ::= SEQUENCE {
algorithm AlgorithmIdentifier,
subjectPublicKey BIT STRING }
Extensions ::= SEQUENCE SIZE (1..MAX) OF Extension
Extension ::= SEQUENCE {
extnID OBJECT IDENTIFIER,
critical BOOLEAN DEFAULT FALSE,
extnValue OCTET STRING }
The structure above captures everything a certificate contains. It starts with a tbsCertificate structure,
tbsCertificate TBSCertificate
followed by the algorithm and the value of the signature over that structure, tbsCertificate:
signatureAlgorithm AlgorithmIdentifier,
signatureValue BIT STRING
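If the signature algorithm is SHA1 with RSA, a SHA-1 hash of the DER-encoded tbsCertificate is computed and then signed with the RSA private key of the certificate's issuer. This certificate is self-signed, so the issuer is the apk signer itself; and as the output above shows, the algorithm actually used here is md5WithRSAEncryption. This signature has nothing to do with the fingerprint printed by keytool: it covers only the TBSCertificate part, whereas the fingerprint is a hash of the whole certificate.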
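The hash-then-sign step can be sketched with the JDK's Signature class. This is only an illustration under stated assumptions: TbsSignatureSketch is a hypothetical class, the key pair is a throwaway one generated on the spot, and an arbitrary byte array stands in for the DER-encoded tbsCertificate. With a real certificate, X509Certificate.getTBSCertificate() and getSignature() return the signed bytes and the signature value.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical sketch: signs stand-in bytes the way a tbsCertificate is signed.
final class TbsSignatureSketch {

    /** Generates a throwaway 2048-bit RSA key pair playing the role of the issuer's key. */
    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** SHA-1 hashes the input and signs the hash with the private key (SHA1withRSA). */
    static byte[] sign(byte[] tbs, PrivateKey issuerKey) {
        try {
            Signature signer = Signature.getInstance("SHA1withRSA");
            signer.initSign(issuerKey);
            signer.update(tbs);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks the signature value against the input using the issuer's public key. */
    static boolean verify(byte[] tbs, byte[] signatureValue, PublicKey issuerKey) {
        try {
            Signature verifier = Signature.getInstance("SHA1withRSA");
            verifier.initVerify(issuerKey);
            verifier.update(tbs);
            return verifier.verify(signatureValue);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair issuer = newRsaKeyPair();
        byte[] tbs = "stand-in for DER-encoded tbsCertificate".getBytes(StandardCharsets.UTF_8);
        byte[] signatureValue = sign(tbs, issuer.getPrivate());
        System.out.println("valid: " + verify(tbs, signatureValue, issuer.getPublic()));
        tbs[0] ^= 1; // flip one bit of the signed data
        System.out.println("after tampering: " + verify(tbs, signatureValue, issuer.getPublic()));
    }
}
```

For a self-signed certificate such as the one in this article, the verifying public key is the certificate's own subjectPublicKey.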
Within TBSCertificate we find the SubjectPublicKeyInfo structure, which holds the public key's algorithm and the public key value:
SubjectPublicKeyInfo ::= SEQUENCE {
algorithm AlgorithmIdentifier,
subjectPublicKey BIT STRING }
We can also dump the certificate used for signing with the following command and compare it with the CERT.RSA output above:
openssl x509 -inform pem -in platform.x509.pem -noout -text
The result is shown below. It is identical, which confirms the earlier claim that CERT.RSA contains the certificate used for signing.
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 12941516320735154170 (0xb3998086d056cffa)
Signature Algorithm: md5WithRSAEncryption
Issuer: C=US, ST=California, L=Mountain View, O=Android, OU=Android, CN=Android/emailAddress=android@android.com
Validity
Not Before: Apr 15 22:40:50 2008 GMT
Not After : Sep 1 22:40:50 2035 GMT
Subject: C=US, ST=California, L=Mountain View, O=Android, OU=Android, CN=Android/emailAddress=android@android.com
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (2048 bit)
Modulus:
00:9c:78:05:92:ac:0d:5d:38:1c:de:aa:65:ec:c8:
a6:00:6e:36:48:0c:6d:72:07:b1:20:11:be:50:86:
3a:ab:e2:b5:5d:00:9a:df:71:46:d6:f2:20:22:80:
c7:cd:4d:7b:db:26:24:3b:8a:80:6c:26:b3:4b:13:
75:23:a4:92:68:22:49:04:dc:01:49:3e:7c:0a:cf:
1a:05:c8:74:f6:9b:03:7b:60:30:9d:90:74:d2:42:
80:e1:6b:ad:2a:87:34:36:19:51:ea:f7:2a:48:2d:
09:b2:04:b1:87:5e:12:ac:98:c1:aa:77:3d:68:00:
b9:ea:fd:e5:6d:58:be:d8:e8:da:16:f9:a3:60:09:
9c:37:a8:34:a6:df:ed:b7:b6:b4:4a:04:9e:07:a2:
69:fc:cf:2c:54:96:f2:cf:36:d6:4d:f9:0a:3b:8d:
8f:34:a3:ba:ab:4c:f5:33:71:ab:27:71:9b:3b:a5:
87:54:ad:0c:53:fc:14:e1:db:45:d5:1e:23:4f:bb:
e9:3c:9b:a4:ed:f9:ce:54:26:13:50:ec:53:56:07:
bf:69:a2:ff:4a:a0:7d:b5:f7:ea:20:0d:09:a6:c1:
b4:9e:21:40:2f:89:ed:11:90:89:3a:ab:5a:91:80:
f1:52:e8:2f:85:a4:57:53:cf:5f:c1:90:71:c5:ee:
c8:27
Exponent: 3 (0x3)
X509v3 extensions:
X509v3 Subject Key Identifier:
4F:E4:A0:B3:DD:9C:BA:29:F7:1D:72:87:C4:E7:C3:8F:20:86:C2:99
X509v3 Authority Key Identifier:
keyid:4F:E4:A0:B3:DD:9C:BA:29:F7:1D:72:87:C4:E7:C3:8F:20:86:C2:99
DirName:/C=US/ST=California/L=Mountain View/O=Android/OU=Android/CN=Android/emailAddress=android@android.com
serial:B3:99:80:86:D0:56:CF:FA
X509v3 Basic Constraints:
CA:TRUE
Signature Algorithm: md5WithRSAEncryption
57:25:51:b8:d9:3a:1f:73:de:0f:6d:46:9f:86:da:d6:70:14:
00:29:3c:88:a0:cd:7c:d7:78:b7:3d:af:cc:19:7f:ab:76:e6:
21:2e:56:c1:c7:61:cf:c4:2f:d7:33:de:52:c5:0a:e0:88:14:
ce:fc:0a:3b:5a:1a:43:46:05:4d:82:9f:1d:82:b4:2b:20:48:
bf:88:b5:d1:49:29:ef:85:f6:0e:dd:12:d7:2d:55:65:7e:22:
e3:e8:5d:04:c8:31:d6:13:d1:99:38:bb:89:82:24:7f:a3:21:
25:6b:a1:2d:1d:6a:8f:92:ea:1d:b1:c3:73:31:7b:a0:c0:37:
f0:d1:af:f6:45:ae:f2:24:97:9f:ba:6e:7a:14:bc:02:5c:71:
b9:81:38:ce:f3:dd:fc:05:96:17:cf:24:84:5c:f7:b4:0d:63:
82:f7:27:5e:d7:38:49:5a:b6:e5:93:1b:94:21:76:5c:49:1b:
72:fb:68:e0:80:db:db:58:c2:02:9d:34:7c:8b:32:8c:e4:3e:
f6:a8:b1:55:33:ed:fb:e9:89:bd:6a:48:dd:4b:20:2e:da:94:
c6:ab:8d:d5:b8:39:92:03:da:ae:2e:d4:46:23:2e:4f:e9:bd:
96:13:94:c6:30:0e:51:38:e3:cf:d2:85:e6:e4:e4:83:53:8c:
b8:b1:b3:57
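As noted above, besides the certificate, CERT.RSA also holds the digital signature over CERT.SF. How do we get at this information? It is of little practical use on its own, since it only matters when the signature is verified at install time, but looking at it helps in understanding how things work. Android has to read this data when it installs an apk, and PackageManager exposes what it read. Note that, despite its name, the signatures field holds the signing certificates rather than the raw signature bytes over CERT.SF: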
getPackageManager().getPackageInfo(packageName,PackageManager.GET_SIGNATURES).signatures
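Here is an example that obtains a certificate instance from CERT.RSA. CERT.RSA is a PKCS#7 block rather than a bare certificate, so generateCertificates() (plural) is the safer call:
// Needs imports from java.io, java.security.cert and java.util.
InputStream in = new FileInputStream("CERT.RSA");
CertificateFactory factory = CertificateFactory.getInstance("X.509");
Collection<? extends Certificate> certs = factory.generateCertificates(in);
X509Certificate cert = (X509Certificate) certs.iterator().next();
in.close();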
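In practice, though, this is not how it is usually done. An apk is a jar file, so it can be parsed with JarFile: loop over the JarEntry objects, call getCertificates() on each, and cast the results to X509Certificate to get all of the certificate's information, without first unpacking the apk to extract CERT.RSA.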
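A minimal sketch of the JarFile approach follows. ApkCertificates and signerCertificates are hypothetical names for this illustration. One detail worth knowing: JarEntry.getCertificates() returns null until the entry's input stream has been read to the end, so the sketch drains each entry first. Also note that a recent JDK may treat JARs signed with weak algorithms such as MD5 or SHA-1 as unsigned, in which case no certificates are returned.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; uses only java.util.jar from the JDK.
final class ApkCertificates {

    /** Collects the distinct X.509 certificates that signed the entries of an apk/jar. */
    static Set<X509Certificate> signerCertificates(File apk) {
        Set<X509Certificate> result = new LinkedHashSet<>();
        byte[] buffer = new byte[8192];
        try (JarFile jar = new JarFile(apk, true)) { // true: verify signatures while reading
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // The signature files under META-INF/ are not themselves signed.
                if (entry.isDirectory() || entry.getName().startsWith("META-INF/")) {
                    continue;
                }
                // Certificates become available only after the entry is fully read.
                try (InputStream in = jar.getInputStream(entry)) {
                    while (in.read(buffer) != -1) {
                        // drain the stream
                    }
                }
                Certificate[] certs = entry.getCertificates();
                if (certs == null) {
                    continue; // unsigned entry
                }
                for (Certificate cert : certs) {
                    if (cert instanceof X509Certificate) {
                        result.add((X509Certificate) cert);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // args[0]: path to an apk or jar
        for (X509Certificate cert : signerCertificates(new File(args[0]))) {
            System.out.println(cert.getSubjectX500Principal() + " " + cert.getSigAlgName());
        }
    }
}
```

If an entry's content does not match its recorded digest, reading it throws a SecurityException, which this sketch lets propagate to the caller.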