
Re: [Xen-devel] Question about running Xen on NVIDIA Jetson-TK1



Hi Julien,

On Mon, May 16, 2016 at 1:33 PM, Julien Grall <julien.grall@xxxxxxx> wrote:
> (CC Kyle who is also working on Tegra?)
>
> Hi Meng,
>
> Many people are working on Nvidia platform with different issues :/. I have
> CCed another person which IIRC is also working on it.

Sure. It's good to know others are also interested in this platform.
That makes it all the more worthwhile to fix... :-)

>
> On 16/05/16 17:33, Meng Xu wrote:
>>
>> On Mon, May 16, 2016 at 7:33 AM, Julien Grall <julien.grall@xxxxxxx>
>> wrote:
>>>
>>>
>>> On 15/05/16 20:35, Meng Xu wrote:
>>>>
>>>>
>>>> I'm trying to run Xen on the NVIDIA Jetson TK1 board. (Right now, Xen
>>>> does not officially support the Jetson board, but I think it would be
>>>> very interesting and useful to see it happen, since the board has a GPU,
>>>> which is quite popular in automotive.)
>>>>
>>>> I have run into a problem booting dom0 under Xen. I want to debug the
>>>> issue and maybe fix it, but I'm not sure how to debug it efficiently.
>>>> I would really appreciate any advice on how to approach it.
>>>> :-)
>>>>
>>>> ---Below is the details----
>>>>
>>>> I noticed that Dushyant from IBM also tried to run Xen on the Jetson
>>>> board (http://www.gossamer-threads.com/lists/xen/devel/422519). I
>>>> used the same Linux kernel (Jan Kiszka's development tree -
>>>> http://git.kiszka.org/linux.git/, branch queues/assorted) and Ian's
>>>> Xen repo with the hack for the Jetson board. I can see that the dom0
>>>> kernel boots to some extent and then stalls/spins before it has fully
>>>> booted.
>>>>
>>>> To narrow down the issue, I booted the exact same Linux kernel natively
>>>> on one CPU and collected the boot log in [1]. I also booted the same
>>>> Linux kernel as dom0 under Xen and collected the boot log in [2].
>>>>
>>>> In Xen environment, dom0 hangs after the following message
>>>> [   10.541010] NET: Registered protocol family 10
>>>> 6mip6: Mobile IPv6
>>>> [   10.542510] mi
>>>>
>>>> In native environment, the kernel has the following log after
>>>> initializing NET.
>>>> [    2.934693] NET: Registered protocol family 10
>>>> [    2.940611] mip6: Mobile IPv6
>>>> [    2.943645] sit: IPv6 over IPv4 tunneling driver
>>>> [    2.951303] NET: Registered protocol family 17
>>>> [    2.955800] NET: Registered protocol family 15
>>>> [    2.960257] can: controller area network core (rev 20120528 abi 9)
>>>> [    2.966617] NET: Registered protocol family 29
>>>> [    2.971098] can: raw protocol (rev 20120528)
>>>> [    2.975384] can: broadcast manager protocol (rev 20120528 t)
>>>> [    2.981088] can: netlink gateway (rev 20130117) max_hops=1
>>>> [    2.986734] Bluetooth: RFCOMM socket layer initialized
>>>> [    2.991979] Bluetooth: RFCOMM ver 1.11
>>>> [    2.995757] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
>>>> [    3.001109] Bluetooth: BNEP socket layer initialized
>>>> [    3.006089] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
>>>> [    3.012052] Bluetooth: HIDP socket layer initialized
>>>> [    3.017894] Registering SWP/SWPB emulation handler
>>>> [    3.029675] tegra-pcie 1003000.pcie-controller: 2x1, 1x1
>>>> configuration
>>>> [    3.036586] +3.3V_SYS: supplied by +VDD_MUX
>>>> [    3.040857] +3.3V_LP0: supplied by +3.3V_SYS
>>>> [    3.045509] +1.35V_LP0(sd2): supplied by +5V_SYS
>>>> [    3.050201] +1.05V_RUN_AVDD: supplied by +1.35V_LP0(sd2)
>>>> [    3.057131] tegra-pcie 1003000.pcie-controller: probing port 0, using
>>>> 2 lanes
>>>> [    3.066479] tegra-pcie 1003000.pcie-controller: Slot present pin
>>>> change, signature: 00000008
>>>>
>>>> I suspect that my dom0 kernel hangs when it tries to initialize
>>>> "can: controller area network core". However, from Dushyant's post at
>>>> http://www.gossamer-threads.com/lists/xen/devel/422519, it seems
>>>> Dushyant's dom0 kernel hangs when it tries to initialize pci_bus. (The
>>>> Linux config I used may be different from Dushyant's. That could be
>>>> the reason.)
>>>>
>>>> Right now, the system just hangs with no output indicating what the
>>>> problem could be. Although there are a lot of error messages before the
>>>> system hangs, I'm not sure whether I should start by resolving all of
>>>> them. Maybe some errors can be ignored?
>>>>
>>>> My questions are:
>>>> 1) Do you have any suggestions on how to get more information about
>>>> why dom0 hangs?
>>>
>>>
>>>
>>> Have you tried dumping the registers using the Xen console (CTRL-x 3 times
>>> then 0) to see where it gets stuck?
>>
>>
>>
>> I tried typing CTRL-x 3 times and then 0, but nothing happens... :-(
>> Just to confirm: once the system is stuck, I type Ctrl-x three times
>> directly on the host's screen. Am I correct?
>
>
> Sorry, I forgot the default way to switch console is CTRL-a three times.
>
> On my configuration I modified the default character to avoid issue with
> screen.
>

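Ah-ha, by typing "Ctrl-a, a" three times, I can switch to the Xen
console now. :-) (Under screen, Ctrl-a is the command key, so "Ctrl-a, a"
is what passes a literal Ctrl-a through to Xen.)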

Thank you for the clarification. :-)

>>
>> Maybe the serial console is not correctly set up?
>
>
> It's likely to be a problem with the serial driver. Can you try
> CTRL-a before the kernel boots?
>
>> The serial console configuration I used is as follows, could you have
>> a quick look to see if it's because I configure the serial
>> incorrectly?
>
>
> I am not familiar with the Nvidia board. If the serial configuration is
> working for baremetal, then it should work for Xen.
>
>>
>> I used the screen program to attach to the serial port.
>> The command I used on the host machine is: $ screen /dev/ttyUSB0 115200n8
>>
>> On the board, I set up the device tree's /chosen node as follows:
>> #
>> fdt print /chosen
>>
>> chosen {
>>
>>          xen,xen-bootargs = "console=dtuart dtuart=serial0
>> dom0_mem=512M loglvl=all guest_loglvl=all dom0_max_vcpus=1
>> dom0_vcpus_pin maxcpus=1";
>>
>>          bootargs = "console=dtuart dtuart=serial0 dom0_mem=512M
>> loglvl=all guest_loglvl=all dom0_max_vcpus=1 dom0_vcpus_pin
>> maxcpus=1";
>
>
> FYI, this property is not necessary.
>
>>          module {
>>
>>                  bootargs = "console=hvc0 console=tty1 earlyprintk=xen
>> root=/dev/mmcblk0p1 rw rootwait";
>
>
> Can you try to add "clk_ignore_unused" on the Linux command line?

Yes. With that option added, I get a different and more useful log.

In the kernel booting log, it first shows this message:

---start of the message----

[   10.607251] Waiting for root device /dev/mmcblk0p1...

... /* I omitted some other unrelated messages */

[    5.347354] sdhci-tegra 700b0400.sdhci: Got WP GPIO

3mmc0: Unknown controller version (3). You may experience problems.

[    5.347464] mmc0: Unknown controller version (3). You may
experience problems.

sdhci-tegra 700b0400.sdhci: No vmmc regulator found

[    5.347647] sdhci-tegra 700b0400.sdhci: No vmmc regulator found

3mmc0: Unknown controller version (3). You may experience problems.

[    5.347933] mmc0: Unknown controller version (3). You may
experience problems.

sdhci-tegra 700b0600.sdhci: No vmmc regulator found

[    5.348099] sdhci-tegra 700b0600.sdhci: No vmmc regulator found

sdhci-tegra 700b0600.sdhci: No vqmmc regulator found

[    5.348162] sdhci-tegra 700b0600.sdhci: No vqmmc regulator found

4mmc0: Invalid maximum block size, assuming 512 bytes

[    5.348222] mmc0: Invalid maximum block size, assuming 512 bytes

6mmc0: SDHCI controller on 700b0600.sdhci [700b0600.sdhci] using ADMA 64-bit

[    5.395208] mmc0: SDHCI controller on 700b0600.sdhci
[700b0600.sdhci] using ADMA 64-bit

6usbcore: registered new interface driver usbhid

---end of the message----

It seems that mmc0 is not correctly recognized by dom0.
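
One way to check which controllers actually registered is to run a small script over the saved console log. The following is only a minimal Python sketch: the function names and regular expressions are assumptions made for illustration, and it assumes that the untimestamped copy of each message carries a single leading log-level digit (as in "3mmc0: ..."), which is how the excerpt above looks.

```python
import re

# Assumption: a timestamped printk line looks like "[    5.347464] message".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# Assumption: the untimestamped copy has one log-level digit (0-7) glued to
# the front of the message, e.g. "3mmc0: ...".
LEVEL = re.compile(r"^[0-7](?=[A-Za-z])")
# The line printed once a controller has been registered.
REGISTERED = re.compile(r"^(mmc\d+): SDHCI controller on (\S+)")


def normalize(line):
    """Strip the timestamp or the log-level digit so both copies compare equal."""
    line = line.strip()
    stripped, count = TIMESTAMP.subn("", line)
    if count:
        return stripped
    return LEVEL.sub("", line)


def registered_controllers(log_text):
    """Return {mmc name: device} for controllers reported as registered."""
    found = {}
    for raw in log_text.splitlines():
        match = REGISTERED.match(normalize(raw))
        if match:
            found[match.group(1)] = match.group(2)
    return found


# Lines taken from the excerpt above.
sample = """\
3mmc0: Unknown controller version (3). You may experience problems.
sdhci-tegra 700b0400.sdhci: No vmmc regulator found
[    5.348099] sdhci-tegra 700b0600.sdhci: No vmmc regulator found
6mmc0: SDHCI controller on 700b0600.sdhci [700b0600.sdhci] using ADMA 64-bit
[    5.395208] mmc0: SDHCI controller on 700b0600.sdhci
"""
print(registered_controllers(sample))
```

On the excerpt this reports only {'mmc0': '700b0600.sdhci'}: 700b0400.sdhci prints probe messages but never an "SDHCI controller on" line.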

Later the dom0 kernel keeps printing this message:

[  426.405136] mmc0: Timeout waiting for hardware interrupt.

Dom0 actually hangs here, because it cannot read the eMMC device. :-(
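
To see how often and over what time span the timeout repeats, the full console capture could be summarized with something like the sketch below. It is a hedged illustration only: the function name and the regular expression are assumptions, and it relies on the "[seconds.microseconds]" timestamp format seen in this log.

```python
import re

# Assumption: timestamps have the form "[  426.405136]" (seconds since boot).
TIMEOUT = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s*(mmc\d+): Timeout waiting for hardware interrupt"
)


def timeout_summary(log_text):
    """Return {controller: (count, first_seconds, last_seconds)}."""
    summary = {}
    for line in log_text.splitlines():
        match = TIMEOUT.match(line.strip())
        if not match:
            continue
        stamp, name = float(match.group(1)), match.group(2)
        # Keep the first timestamp seen; update the count and last timestamp.
        count, first, _ = summary.get(name, (0, stamp, stamp))
        summary[name] = (count + 1, first, stamp)
    return summary


# Two lines taken from the log above; a real run would read the whole capture.
sample = """\
[    5.395208] mmc0: SDHCI controller on 700b0600.sdhci
[  426.405136] mmc0: Timeout waiting for hardware interrupt.
"""
print(timeout_summary(sample))
```

A steadily growing count for one controller, with nothing else logged in between, would be consistent with dom0 waiting on that device indefinitely.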

Is it because the device tree is not properly recreated by Xen?
Do you happen to know how to fix this, or have an idea of where to
start?  I can have a look at it.

BTW, dom0 didn't recognize the mmc1 controller either.

Thank you very much for your time and help with this! :-)

Best Regards,

Meng
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

