[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 2/2] iommu: add rmrr Xen command line option for misc rmrrs



On Tue, May 05, 2015 at 11:14:30AM +0100, Jan Beulich wrote:
> >>> On 28.04.15 at 01:50, <elena.ufimtseva@xxxxxxxxxx> wrote:
> > Format for rmrr Xen command line option:
> > rmrr=start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]
> > If grub used and multiple ranges are specified, ';' should be 
> > quoted/escaped,
> > refer to grub man for more information.
> 
> I think you mean grub2 here?
yes.
> 
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1170,6 +1170,18 @@ Specify the host reboot method.
> >  'efi' instructs Xen to reboot using the EFI reboot call (in EFI mode by
> >   default it will use that method first).
> >  
> > +### rmrr
> > +> '= 
> > start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]
> > +
> > +Define RMRRs units that are missing from ACPI table along with device
> > +they belong to and use them for 1:1 mapping. End addresses can be omitted
> > +and one page will be mapped. The ranges are inclusive when start and end
> > +are specified. Note: grub requires to escape or use quotations if special
> > +characters are used, namely ';', refer to the grub documentation if 
> > multiple
> > +ranges are specified.
> 
> Same here then.
> 
> > @@ -846,6 +847,72 @@ out:
> >      return ret;
> >  }
> >  
> > +static __initdata LIST_HEAD(misc_rmrr_units);
> > +
> > +static void __init add_misc_rmrr(void)
> 
> s/misc/aux/ maybe? Also I don't really understand what you need that
> list head for - the presently otherwise unused nr_rmrr allows you to
> know how many valid entries there are in rmrru[].

I think its a legacy list head, will check.

> 
> > +{
> > +    struct acpi_rmrr_unit *rmrrn;
> > +    struct misc_rmrr_unit *rmrru;
> > +    unsigned int dev;
> > +    unsigned int seg;
> > +
> > +    list_for_each_entry( rmrru, &misc_rmrr_units, list )
> > +    {
> > +        rmrrn = xmalloc(struct acpi_rmrr_unit);
> > +        if ( !rmrrn )
> > +            return;
> > +
> > +        rmrrn->scope.devices = xmalloc_array(typeof(*rmrrn->scope.devices),
> > +                                             rmrru->dev_count);
> > +        if ( !rmrrn->scope.devices )
> > +        {
> > +            xfree(rmrrn);
> > +            return;
> > +        }
> > +        if ( mfn_valid(rmrru->base_address << PAGE_SHIFT) ||
> > +             mfn_valid(rmrru->end_address << PAGE_SHIFT) )
> 
> Please use PFN_UP() and PFN_DOWN() as appropriate. Also, how
> large do we expect custom RMRRs to reasonably get? If no more
> than a few pages, I think it would be quite reasonable to check
> all involved pages here rather than just first and last one. (And
> even if there can be many pages here, leaving at least a comment
> saying that this could do with improvement would be nice.)

Hm, thats a hard part not having these regions documented.                      
But I have found this thread  https://lkml.org/lkml/2014/6/18/683.              
If I am reading it right, for GPU that region maybe as big as 1024M.            
For USB I think its smaller, but hard to prove without docs.                    
I think we should just limit it to some max amount per RMRR and document        
it. What you think?

> 
> > +        {
> > +            if ( iommu_verbose )
> > +                printk(XENLOG_ERR VTDPREFIX
> > +                       "Invalid RMRR range [%"PRIx64" - %"PRIx64"]\n",
> > +                       rmrru->base_address, rmrru->end_address);
> 
> I think this should be issued regardless of iommu_verbose.
Ok.
> 
> > +            xfree(rmrrn->scope.devices);
> > +            xfree(rmrrn);
> > +            return;
> > +        }
> > +
> > +        rmrrn->base_address = rmrru->base_address;
> > +        rmrrn->end_address = rmrru->end_address;
> > +        seg = rmrru->sbdf[0] >> 16;
> > +
> > +        for ( dev = 0; dev < rmrru->dev_count; dev++ )
> > +        {
> > +            rmrrn->scope.devices[dev] = rmrru->sbdf[dev];
> > +            seg = seg | (rmrru->sbdf[dev] >> 16);
> 
> |=
> 
> The loop starting at 0 there also is no need to set up seg to other
> than plain zero before the loop.
> 
> But what I'm worried about is invalid uses like
> 
> rmrr=<addr>=0001:bb:dd.f,0000:bb:dd.f
> 
> (as opposed to the valid
> 
> rmrr=<addr>=0001:bb:dd.f,bb:dd.f
> 
> ). On an earlier version I had specifically pointed out that dealing
> with an omitted vs a zero segment requires special care.

Yes, you did.                                                                   
This is just a very easy way to verify segments by making sure                  
they are all equal if multiple devices specified with minimum code              
change.                                                                         
But that can be done in other way as well.                                      
In your proposed second valid case, segment 0001 should be assumed for          
second device as its own segment is not specified? 

> 
> > +        }
> > +
> > +        if ( seg != (rmrru->sbdf[0] >> 16) )
> > +        {
> > +            if ( iommu_verbose )
> > +                printk(XENLOG_ERR VTDPREFIX
> > +                       "Segments are not equal for RMRR range  [%"PRIx64" 
> > - %"PRIx64"]\n",
> > +                       rmrru->base_address, rmrru->end_address);
> 
> Same here regarding iommu_verbose.
Ok.
> 
> > +            xfree(rmrrn->scope.devices);
> > +            xfree(rmrrn);
> > +            return;
> > +        }
> > +
> > +        rmrrn->segment = seg;
> > +        rmrrn->scope.devices_cnt = rmrru->dev_count;
> > +
> > +        if ( register_one_rmrr(rmrrn) )
> > +        {
> > +            xfree(rmrrn->scope.devices);
> > +            xfree(rmrrn);
> 
> Perhaps also worth an error message?
Sure.
> 
> > +static void __init parse_rmrr_param(const char *str)
> > +{
> > +    const char *s = str, *cur, *stmp;
> > +    unsigned int i = 0, seg, bus, dev, func;
> > +    u64 start, end;
> > +
> > +    do {
> > +        start = simple_strtoull(cur = s, &s, 0);
> > +        if ( cur == s )
> > +            break;
> > +
> > +        if ( *s == '-' )
> > +        {
> > +            end = simple_strtoull(cur = s + 1, &s, 0);
> > +            if ( cur == s )
> > +                break;
> > +        }
> > +        else
> > +            end = start;
> > +        if ( i < MAX_MISC_RMRR )
> > +        {
> > +            rmrru[i].base_address = start << PAGE_SHIFT;
> > +            rmrru[i].end_address = (end + 1) << PAGE_SHIFT;
> 
> Considering an earlier comment perhaps it would be better to simply
> store frame numbers here, and convert to addresses only in
> add_misc_rmrr()?
Yes, may work better with veryfing the entire range of addresses.               

> 
> > +            rmrru[i].dev_count = 0;
> > +        }
> > +        else
> > +            break;
> 
> Please invert the if() condition (allowing to do things without "else")
> and move the check up to the top of the loop (or even into the
> loop condition further down).
Ok.
> 
> > +
> > +        if ( *s != '=' )
> > +            continue;
> > +
> > +        do {
> > +            if ( rmrru[i].dev_count >= MAX_MISC_RMRR_DEV )
> > +                break;
> > +            if ( *s == ';' )
> > +                break;
> > +            stmp = parse_pci(s + 1, &seg, &bus, &dev, &func);
> > +            if ( !stmp )
> > +                break;
> > +            rmrru[i].sbdf[rmrru[i].dev_count] = PCI_BDF(bus, dev, func) | 
> > seg << 16;
> 
> Parentheses around the shift please.
> 
> > --- a/xen/drivers/passthrough/vtd/dmar.h
> > +++ b/xen/drivers/passthrough/vtd/dmar.h
> > @@ -132,4 +132,14 @@ void disable_pmr(struct iommu *iommu);
> >  int is_usb_device(u16 seg, u8 bus, u8 devfn);
> >  int is_igd_drhd(struct acpi_drhd_unit *drhd);
> >  
> > +/*RMRR units derived from command line rmrr option */
> > +#define MAX_MISC_RMRR_DEV 20
> > +struct misc_rmrr_unit {
> > +    struct list_head list;
> > +    u64    base_address;
> > +    u64    end_address;
> > +    u16    dev_count;
> > +    u32    sbdf[MAX_MISC_RMRR_DEV];
> > +};
> 
> This type is used in only one source file - no need to expose it here.
> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.