
RE: [Xen-ia64-devel] [PATCH] [RFC] [TAKE3] P2M/VP (incomplete) patches


  • To: "Isaku Yamahata" <yamahata@xxxxxxxxxxxxx>, <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Fri, 24 Mar 2006 23:11:38 +0800
  • Delivery-date: Fri, 24 Mar 2006 15:13:07 +0000
  • List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
  • Thread-index: AcZPJym5I4KRA7HeSTeu+Jwm5MkHngAKcbOw
  • Thread-topic: [Xen-ia64-devel] [PATCH] [RFC] [TAKE3] P2M/VP (incomplete) patches

>From: Isaku Yamahata
>Sent: March 24, 2006 17:41
>
>
>Hello all xen/ia64 developers.
>The attached patches for xen-ia64-unstable.hg are the incomplete
>patches of the P2M/VP model, take 3.
>With these patches I can ssh to domU from a remote machine.
>These patches are still incomplete, and a grant table API clean up is
>planned.
>It should be discussed before actual coding,
>so I am posting the take 3 patches for discussion.
>I will post documentation for the discussion in a separate mail.

Hi, Isaku,
        Nice write-up and good work so far. Because the memory model is the
most critical and basic component, your work ends up touching many areas,
as the issues listed below show. :-) Maybe you should make a priority
list, and see whether some core components can be split into
self-contained parts, with the major cleanup effort spent on them first.

        Some quick comments:

>- grant table API clean up
>  This is necessary for merging to xen/x86 upstream.
>  A documentation might be also needed.
>  - grant table read-only mapping

Do you mean that the grant table itself is presented as read-only to the
guest? The x86 version lets xenlinux manage allocation/release of grant
table entries.

>  - number of grant table entries

Yes, there seems to be no reason to limit IA64 to one entry. :-)

>- grant table mapped page reference count and domain destruction
>  If a domain is destroyed when it maps a foreign domain's page
>  via grant table, its page reference count will be leaked.
>  Page freeing code is needed.

That's an area where you can cooperate with Kan and Akio. The current
domain destroy/page refcnt code has only been weakly tested, and without
your p2m feature in.

>
>- xen_start_info->{console_mfn, store_mfn}
>  These are defined as machine frame numbers. However, on Xen/IA64 with
>  the P2M/VP model, these should be pseudo physical frame numbers.
>  So they should be gmfn instead of mfn.

Ha, making them pseudo physical actually saves one hypercall that would
otherwise be used to retrieve the mfn.

>- unaligned access
>  When using the network, (perhaps) dom0 complains with the following
>  messages.
>  kernel unaligned access to 0xe0000000189ec21e, ip=0xa0000001005644f1
>  kernel unaligned access to 0xe0000000189ec21e, ip=0xa000000100564590
>  kernel unaligned access to 0xe0000000189ec21e, ip=0xa000000100564630
>  In my environment, these ip's happened to be in the Linux bridge code.
>  I haven't tracked them down yet and I'm not very familiar with the
>  linux/net/bridge directory. So I'm not sure whether this is due to
>  the original Linux/IA64, Xen/IA64, or the P2M/VP model patch.
>  Is there anyone who wants to dig into this?

One example we encountered before was in qemu, where cmpxchgN was
used on an area not aligned to N bytes.
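
To make that concrete, here is a minimal C sketch of an alignment check
one could place in front of an N-byte atomic operation while hunting for
such a case. The helper name is_aligned is made up for illustration and
is not taken from qemu, Linux or Xen. Applied to the address in the
messages quoted above, it says the access is fine for a 2-byte operand
but misaligned for a 4-byte or 8-byte one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of n, else 0.
 * n must be a power of two (1, 2, 4 or 8 for cmpxchgN). */
static int is_aligned(uint64_t addr, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);   /* reject non-power-of-two sizes */
    return (addr & (uint64_t)(n - 1)) == 0; /* low bits must all be zero */
}
```

For example, is_aligned(0xe0000000189ec21e, 4) is 0 because the low two
bits of 0x1e are not zero.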

>
>- sometimes Xen hangs with the following message.
>  (XEN) ia64_fault: General Exception: IA-64 Reserved Register/Field fault (data access): reflecting

Most of the time, such a fault shows up far from the actual point of
error, with many faults already nested. Just a hint... :-)

>- copy_to_guest(), copy_from_guest()
>  They are broken.
>  Their copy may succeed or may result in EFAULT depending on the TLB
>  cache state.
>  Fortunately the xen/PPC port has already solved similar problems.

That has been a potential issue for a long time. It seldom occurred
previously because dom0 memory was contiguous then, so a large TLB
entry such as 16M could be inserted into the VHPT and machine TLB.
However, after the transition to the p2m model, which allows for
non-contiguous memory, we get many smaller TLB entries (16k), and thus
copy_to/from_guest is more likely to fail.
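
A back-of-the-envelope sketch of why the odds get worse: the more
translations a buffer needs, the more chances that one of them is
missing when Xen touches it. The function name below is illustrative
only, not existing Xen code. A 16M region needs one entry with a 16M
mapping but 1024 entries with 16k mappings, and even a 2-byte buffer can
straddle two pages.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct pages touched by the byte range [addr, addr + len).
 * page_size must be non-zero; an empty range touches no page. */
static uint64_t pages_spanned(uint64_t addr, uint64_t len, uint64_t page_size)
{
    assert(page_size != 0);
    if (len == 0)
        return 0;
    /* index of last page minus index of first page, inclusive */
    return (addr + len - 1) / page_size - addr / page_size + 1;
}
```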

IA64 is a bit different from PPC, since xen/ia64 can walk guest virtual
addresses directly while PPC can't. So normally people have two options:

- Inject a fault into the guest on failure, and then let the guest
re-execute the hypercall. The con is that forward progress may not be
guaranteed when the parameter buffer is huge. Some transient data may be
needed to track the progress.

- Do it the xen/PPC way: pass machine physical addresses with a
scatter/gather list. The con is that it makes things worse when the
translation for the guest buffer already exists in the mTLB and VHPT.
(No poor man there)

It's a trade-off to balance.
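
For the second option, the core step is chopping the buffer into pieces
that never cross a page boundary, so each piece can be described by one
physical address. Below is a minimal sketch of that step, assuming 16k
pages. The names sg_entry, build_sg_list and SKETCH_PAGE_SIZE are
hypothetical and this is not the actual xen/PPC code. The entries here
keep the original address; a real implementation would store the
translated machine physical address of each piece instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 16384u   /* assumed 16k page size */

/* Hypothetical scatter/gather element: one piece within a single page. */
struct sg_entry {
    uint64_t addr;   /* start of the piece (untranslated in this sketch) */
    size_t   len;    /* length of the piece in bytes */
};

/* Split [start, start + len) into pieces that do not cross a page
 * boundary. Returns the number of entries written, or -1 if more than
 * max entries would be needed. */
static int build_sg_list(uint64_t start, size_t len,
                         struct sg_entry *sg, int max)
{
    int n = 0;

    assert(sg != NULL && max >= 0);
    while (len > 0) {
        /* bytes left before the next page boundary */
        size_t room = SKETCH_PAGE_SIZE -
                      (size_t)(start & (SKETCH_PAGE_SIZE - 1));
        size_t chunk = len < room ? len : room;

        if (n == max)
            return -1;
        sg[n].addr = start;
        sg[n].len = chunk;
        n++;
        start += chunk;
        len -= chunk;
    }
    return n;
}
```

A 300-byte buffer starting 100 bytes before a page boundary becomes two
entries, of 100 and 200 bytes.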

>- panic_domain()
>  This function is called when a domain behaves in a way xen can't
>  handle well.
>  The domain should be stopped, but xen should continue to run.
>  But the current implementation results in BUG() in xen or drops to
>  the debugger, so that xen itself stops.

Agreed. It's better for Xen to survive as far as possible by confining
domain-specific issues to the domain itself.
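
Roughly what I have in mind, as a toy sketch rather than a patch: mark
only the offending domain as crashed and return, so the hypervisor keeps
scheduling the others. All types and names below (sketch_domain,
panic_domain_sketch, is_runnable) are invented for illustration and do
not match the real Xen structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum dom_state { DOM_RUNNING, DOM_CRASHED };

/* Hypothetical, stripped-down stand-in for a domain descriptor. */
struct sketch_domain {
    int            id;
    enum dom_state state;
};

/* Stop only the offending domain: log the reason, mark it crashed and
 * return to the caller. No BUG(), no drop into the debugger. */
static void panic_domain_sketch(struct sketch_domain *d, const char *why)
{
    assert(d != NULL);
    fprintf(stderr, "domain %d crashed: %s\n", d->id, why);
    d->state = DOM_CRASHED;
}

/* Scheduler-side check: a crashed domain is never picked to run. */
static int is_runnable(const struct sketch_domain *d)
{
    return d->state == DOM_RUNNING;
}
```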

Thanks,
Kevin

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel


 

