It has been a long time since my last post. I am now working on an Android MIPS porting project, and I want to run Android on a MIPS emulator.
The problem is that when I run mips-android on qemu, it hangs while executing the init program in the initramfs root file system. I used remote gdb to debug init and found that this is because pa_workspace is not initialized.
The function ashmem_create_region opens /dev/ashmem and returns the fd on success. However, it returns -1 and errno is 19 (ENODEV), which means "No such device".
fd = ashmem_create_region("system_properties", size);
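To make this kind of failure easier to spot, here is a minimal sketch of an open call that reports errno by name. The helper open_ashmem is hypothetical and is not the actual ashmem_create_region code; it only illustrates the check.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not the real ashmem_create_region): open a device
 * node and report why it failed. Returns the fd on success, or -1 with
 * errno preserved for the caller. */
static int open_ashmem(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        int saved = errno;
        fprintf(stderr, "open(%s) failed: errno=%d (%s)\n",
                path, saved, strerror(saved));
        errno = saved; /* fprintf may have changed errno */
        return -1;
    }
    return fd;
}
```

On Linux, errno 19 prints as "No such device".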
The question is: who is responsible for creating /dev/ashmem?
In fact, Android uses a udev-like mechanism to create device nodes in /dev when the function device_init runs. The full coldboot path that creates a device is as follows:
device_init->coldboot->do_coldboot->write(fd, "add\n", 4)->handle_device_fd->handle_device_event->make_device
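The write in the middle of that chain is what asks the kernel to re-send an "add" uevent for an existing device. A minimal sketch of that single step, assuming the fd refers to a sysfs uevent file; the helper name trigger_add_uevent is hypothetical and this is not the actual init code:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write "add\n" (4 bytes) to a uevent file so the
 * kernel emits an "add" event for that device again.
 * Returns 0 on success, -1 on error. */
static int trigger_add_uevent(const char *uevent_path)
{
    int fd = open(uevent_path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, "add\n", 4);
    close(fd);
    return n == 4 ? 0 : -1;
}
```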
The function parse_event parses the uevent message and passes the resulting uevent to handle_device_event. However, I found that the uevent message looks a little weird. I used remote gdb to dump it:
0x7ff3d250: "add@/class/tty/console"
0x7ff3d267: "ACTION=add"
0x7ff3d272: "DEVPATH=/class/tty/console"
0x7ff3d28d: "SUBSYSTEM=tty"
0x7ff3d29b: "MAJOR=+"
0x7ff3d2a3: "MINOR=/"
0x7ff3d2ab: "SEQNUM=31+"
0x7ff3d2b6: ""
0x7ff3d2b7: ""
0x7ff3d2b8: ""
As you can see, MAJOR in the message is "+" and MINOR is "/". This confuses parse_event, so the corresponding device is never created.
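To illustrate why such a message is unusable, here is a small sketch of a strict parser for the MAJOR and MINOR fields of a uevent buffer (NUL-terminated "KEY=value" strings, ending with an empty string). It is not the parser used by init; the names uevent_ids, parse_id and parse_ids are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

struct uevent_ids { int major; int minor; };

/* Strict decimal parse: returns -1 unless the whole value is digits. */
static int parse_id(const char *s)
{
    char *end;
    if (*s < '0' || *s > '9')
        return -1;
    long v = strtol(s, &end, 10);
    return (*end == '\0' && v >= 0 && v <= 0x7fffffff) ? (int)v : -1;
}

/* Walk the NUL-separated fields and pick out MAJOR= and MINOR=. */
static struct uevent_ids parse_ids(const char *msg)
{
    struct uevent_ids ids = { -1, -1 };
    while (*msg) {
        if (!strncmp(msg, "MAJOR=", 6))
            ids.major = parse_id(msg + 6);
        else if (!strncmp(msg, "MINOR=", 6))
            ids.minor = parse_id(msg + 6);
        msg += strlen(msg) + 1; /* skip to the next field */
    }
    return ids;
}
```

With the dumped message, both fields come back as -1, so there is no valid device number to create a node with.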
It looks like the kernel passes a wrong uevent message to user space. So the question is: who has messed up the uevent message?
Then I recalled that there were some weird messages when booting the Linux kernel:
Primary instruction cache .kB, VIPT, 2-way, linesize 1* bytes.
Primary data cache .kB, 2-way, VIPT, no aliases, linesize 1* bytes
As you can see, the instruction cache size is ".kB" and the linesize is "1*" bytes, which are not valid numbers at all.
So I suspected that something goes wrong in the kernel when it formats numbers, and I used remote gdb to debug the kernel along this path:
r4k_cache_init->probe_pcache->printk->vprintk->vscnprintf->vsnprintf->number->put_dec->put_dec_trunc
The function put_dec_trunc takes an unsigned int in the range [0, 99999] and writes it out as a string of decimal digits. But I found that when the input is 2, the output is '.', not the expected '2'. So maybe this function is the bad boy.
In the following listing I assume q=2.
277 static char* put_dec_trunc(char *buf, unsigned q)
278 {
279     unsigned d3, d2, d1, d0;
280     d1 = (q>>4) & 0xf; /*d1=0*/
281     d2 = (q>>8) & 0xf; /*d2=0*/
282     d3 = (q>>12); /*d3=0*/
283
284     d0 = 6*(d3 + d2 + d1) + (q & 0xf); /*d0=2*/
285     q = (d0 * 0xcd) >> 11; /*q=0*/
286     d0 = d0 - 10*q; /*d0=2*/
287     *buf++ = d0 + '0'; /* least significant digit */
288     d1 = q + 9*d3 + 5*d2 + d1; /*d1=0*/
289     if (d1 != 0) { /* it is false so we won't get into here.*/
290         q = (d1 * 0xcd) >> 11;
291         d1 = d1 - 10*q;
292         *buf++ = d1 + '0'; /* next digit */
293
294         d2 = q + 2*d2;
295         if ((d2 != 0) || (d3 != 0)) {
296             q = (d2 * 0xd) >> 7;
297             d2 = d2 - 10*q;
298             *buf++ = d2 + '0'; /* next digit */
299
300             d3 = q + 4*d3;
301             if (d3 != 0) {
302                 q = (d3 * 0xcd) >> 11;
303                 d3 = d3 - 10*q;
304                 *buf++ = d3 + '0'; /* next digit */
305                 if (q != 0)
306                     *buf++ = q + '0'; /* most sign. digit */
307             }
308         }
309     }
310     return buf;
311 }
/* MIPS uses a0/a1 to pass arguments.
* a0= address of buf
* a1= q = 2
*/
8015a040 <put_dec_trunc>:
8015a040: 00051202 srl v0,a1,0x8 /* v0= q>>8*/
8015a044: 00053102 srl a2,a1,0x4 /* a2= q>>4*/
8015a048: 30c6000f andi a2,a2,0xf /* a2= (q>>4) & 0xf = d1 in line 280*/
8015a04c: 3048000f andi t0,v0,0xf /* t0 = (q>>8) & 0xf = d2 in line 281*/
8015a050: 00054b02 srl t1,a1,0xc /* t1= (q>>12) = d3 in line 282*/
8015a054: 00c81021 addu v0,a2,t0
8015a058: 00491021 addu v0,v0,t1 /* v0 = d1+d2+d3 in line 284*/
8015a05c: 24030006 li v1,6
8015a060: 70433802 mul a3,v0,v1 /*a3= 6*(d3 + d2 + d1)*/
8015a064: 30a5000f andi a1,a1,0xf /*a1= q & 0xf*/
8015a068: 24020009 li v0,9 /*v0=9*/
8015a06c: 00e56021 addu t4,a3,a1 /*t4= 6*(d3 + d2 + d1) + (q & 0xf) = d0 in line 284*/
8015a070: 240b00cd li t3,205 /*t3= 0xcd*/
8015a074: 71223802 mul a3,t1,v0 /*a3= 9*d3 in line 288. Apparently gcc has reordered the code.*/
8015a078: 718b1802 mul v1,t4,t3 /*v1= (d0*0xcd)*/
8015a07c: 24020005 li v0,5
8015a080: 00e62821 addu a1,a3,a2 /*a1= 9*d3 + d1 in line 288*/
8015a084: 71023002 mul a2,t0,v0 /*a2= 5*d2 in line 288*/
8015a088: 00031ac2 srl v1,v1,0xb /*v1= (d0*0xcd)>>11 in line 285*/
8015a08c: 240a000a li t2,10 /*t2=10*/
8015a090: 01800013 mtlo t4 /*put t4->lo. lo=t4= 6*(d3 + d2 + d1) + (q & 0xf) = d0 in line 284 */
8015a094: 706a0004 msub v1,t2 /*hilo = hilo - v1*t2 = d0 - 10*q in line 286*/
8015a098: 00c51021 addu v0,a2,a1
8015a09c: 00002812 mflo a1 /*lo->a1*/
8015a0a0: 00431821 addu v1,v0,v1
8015a0a4: 24a20030 addiu v0,a1,48
8015a0a8: 00803821 move a3,a0
8015a0ac: a0820000 sb v0,0(a0)
We just need to look at the instruction at 0x8015a094, which is an msub instruction. The definition of msub is as follows:
(HI,LO) = (HI,LO) - (GPR[RS]*GPR[RT])
So after executing the instruction at 0x8015a094, HI/LO should be 0/2. But qemu produces 0xffffffff/0xfffffffe, which is -2. Maybe this is the problem.
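The two results can be reproduced with a small model of the accumulator. This is only a sketch of the arithmetic, not qemu code; the names hilo, msub_correct and msub_swapped are made up here, and the register values (HI=0, LO=2, rs=0, rt=10) are the ones assumed in the walkthrough for q=2.

```c
#include <stdint.h>

struct hilo { uint32_t hi; uint32_t lo; };

/* Combine HI/LO into one 64-bit value and split it back. */
static uint64_t join(struct hilo acc)
{
    return ((uint64_t)acc.hi << 32) | acc.lo;
}

static struct hilo split(uint64_t v)
{
    struct hilo r = { (uint32_t)(v >> 32), (uint32_t)v };
    return r;
}

/* Signed 32x32 -> 64 multiply, kept as a bit pattern. */
static uint64_t product(int32_t rs, int32_t rt)
{
    return (uint64_t)((int64_t)rs * (int64_t)rt);
}

/* Definition: (HI,LO) = (HI,LO) - rs*rt */
static struct hilo msub_correct(struct hilo acc, int32_t rs, int32_t rt)
{
    return split(join(acc) - product(rs, rt));
}

/* Operands swapped: (HI,LO) = rs*rt - (HI,LO) */
static struct hilo msub_swapped(struct hilo acc, int32_t rs, int32_t rt)
{
    return split(product(rs, rt) - join(acc));
}
```

With HI/LO = 0/2 and a product of 0, msub_correct gives 0/2 while msub_swapped gives 0xffffffff/0xfffffffe, which matches what qemu produced.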
Next I needed to find out how qemu emulates the msub instruction:
    case OPC_MSUB:
        {
            TCGv r_tmp1 = tcg_temp_new(TCG_TYPE_I64);
            TCGv r_tmp2 = tcg_temp_new(TCG_TYPE_I64);
            TCGv r_tmp3 = tcg_temp_new(TCG_TYPE_I64);

            tcg_gen_ext32s_tl(t0, t0);
            tcg_gen_ext32s_tl(t1, t1);
            tcg_gen_ext_tl_i64(r_tmp1, t0);
            tcg_gen_ext_tl_i64(r_tmp2, t1);
            tcg_gen_mul_i64(r_tmp1, r_tmp1, r_tmp2); /*r_tmp1= gpr[rs]*gpr[rt] */
            gen_load_LO(t0, 0); /*t0 <- lo*/
            gen_load_HI(t1, 0); /*t1 <- hi*/
            tcg_gen_extu_tl_i64(r_tmp2, t0); /*r_tmp2 = 64bit expand of lo*/
            tcg_gen_extu_tl_i64(r_tmp3, t1); /*r_tmp3 = 64bit expand of hi*/
            tcg_gen_shli_i64(r_tmp3, r_tmp3, 32);
            tcg_gen_or_i64(r_tmp2, r_tmp2, r_tmp3);
            tcg_temp_free(r_tmp3);
            tcg_gen_sub_i64(r_tmp1, r_tmp1, r_tmp2); /*r_tmp1= r_tmp1 - r_tmp2 = gpr[rs]*gpr[rt] - HI/LO */
            tcg_temp_free(r_tmp2);
            tcg_gen_trunc_i64_tl(t0, r_tmp1);
            tcg_gen_shri_i64(r_tmp1, r_tmp1, 32);
            tcg_gen_trunc_i64_tl(t1, r_tmp1);
            tcg_temp_free(r_tmp1);
            tcg_gen_ext32s_tl(t0, t0);
            tcg_gen_ext32s_tl(t1, t1);
            gen_store_LO(t0, 0);
            gen_store_HI(t1, 0);
        }
        opn = "msub";
        break;
As you can see, qemu emulates the msub instruction incorrectly. It computes gpr[rs]*gpr[rt] - HI/LO and stores the result in HI/LO, which differs from the definition of the msub instruction. I patched the qemu code and it works.
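The swapped subtraction also seems to explain the exact garbage characters seen earlier. In put_dec_trunc the value before the subtraction is d0 = 10*q + d, where d is the true last digit, so 10*q - d0 yields -d and the byte stored is '0' - d. The sketch below shows that mapping. It assumes that only the least-significant-digit subtraction was compiled into msub, as the disassembly at 0x8015a094 suggests; the function name garbled_digit is made up.

```c
/* Character emitted for a least significant digit d (0..9) when
 * d0 - 10*q is evaluated as 10*q - d0: the digit is negated before
 * '0' is added, so the result is '0' - d. */
static char garbled_digit(unsigned d)
{
    return (char)('0' - (int)d);
}
```

Under that assumption 2 becomes '.', 6 becomes '*' (so a linesize of 16 prints as "1*"), 5 becomes '+' and 1 becomes '/', which would fit MAJOR=5 and MINOR=1, the usual device numbers for /dev/console.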
BTW: the MIPS32 4K™ Processor Core Family Software User's Manual (document MD00016) gives an incorrect operation for the msub instruction on page 253.
Operation:
temp ← (HI || LO) - (GPR[rs] * GPR[rt])
HI ← temp63..32
LO ← temp31..0
Maybe this misled the qemu developers. The latest qemu version has fixed this bug, as the emulation of this instruction in qemu-svn-20091014 shows:
    case OPC_MSUB:
        {
            TCGv_i64 t2 = tcg_temp_new_i64();
            TCGv_i64 t3 = tcg_temp_new_i64();

            tcg_gen_ext_tl_i64(t2, t0);
            tcg_gen_ext_tl_i64(t3, t1);
            tcg_gen_mul_i64(t2, t2, t3); /*t2=GPR[RS]*GPR[RT] */
            tcg_gen_concat_tl_i64(t3, cpu_LO[0], cpu_HI[0]); /*t3= HI/LO*/
            tcg_gen_sub_i64(t2, t3, t2); /*t2= HI/LO - GPR[RS]*GPR[RT] */
            tcg_temp_free_i64(t3);
            tcg_gen_trunc_i64_tl(t0, t2);
            tcg_gen_shri_i64(t2, t2, 32);
            tcg_gen_trunc_i64_tl(t1, t2);
            tcg_temp_free_i64(t2);
            tcg_gen_ext32s_tl(cpu_LO[0], t0);
            tcg_gen_ext32s_tl(cpu_HI[0], t1);
        }
        opn = "msub";
        break;
See this link for more information.
Finding this bug was really not easy. I had to dig and dig, from userland to the Linux kernel and then into qemu, until I caught this nasty qemu bug. Thanks to remote gdb and the gdb stub in qemu, which made life much easier.
The following is my patch to qemu.
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 3dded6c..0a1b461 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -2193,7 +2193,11 @@ static void gen_muldiv (DisasContext *ctx, uint32_t opc,
tcg_gen_shli_i64(r_tmp3, r_tmp3, 32);
tcg_gen_or_i64(r_tmp2, r_tmp2, r_tmp3);
tcg_temp_free(r_tmp3);
- tcg_gen_sub_i64(r_tmp1, r_tmp1, r_tmp2);
+ /* msub means HI/LO = HI/LO - GPR[RS]*GPR[RT],
+ * not HI/LO = GPR[RS]*GPR[RT] - HI/LO
+ */
+ //tcg_gen_sub_i64(r_tmp1, r_tmp2, r_tmp2);
+ tcg_gen_sub_i64(r_tmp1, r_tmp2, r_tmp1);
tcg_temp_free(r_tmp2);
tcg_gen_trunc_i64_tl(t0, r_tmp1);
tcg_gen_shri_i64(r_tmp1, r_tmp1, 32);