buildman: Add the option to download toolchains from kernel.org

The site at https://www.kernel.org/pub/tools/crosstool/ is a convenient
repository of toolchains which can be used for U-Boot. Add a feature to
download and install a toolchain for a selected architecture automatically.

It isn't clear how long this site will stay in the current place and
format, but we should be able to rely on bug reports if it changes.
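
For reference, the way architecture names are derived from the archive
file names can be sketched as below. This is an illustrative stand-alone
version of the regex logic in ListArchs(), not the code added by this
patch: list_archs() and ARCHIVES are hypothetical names, and the
am33_2.0 file name is an assumed example of the site's naming scheme.

```python
import re

# Sample archive names. The first two appear in this patch; the third is
# an assumed example that follows the same naming scheme.
ARCHIVES = [
    'x86_64-gcc-4.6.3-nolibc_arm-unknown-linux-gnueabi.tar.xz',
    'x86_64-gcc-4.5.1-nolibc_or32-linux.tar.xz',
    'x86_64-gcc-4.6.3-nolibc_am33_2.0-linux.tar.xz',
]

def list_archs(host_arch, archives):
    """Return a sorted list of target architectures found in archives"""
    # The target architecture sits between the first '_' (after the host
    # architecture has been stripped) and the following '-'
    re_arch = re.compile(r'[-a-z0-9.]*_([^-]*)-.*')
    arch_set = set()
    for archive in archives:
        # Strip the host architecture first, since it may itself contain
        # an underscore (e.g. x86_64)
        match = re_arch.match(archive[len(host_arch):])
        if match:
            arch_set.add(match.group(1))
    return sorted(arch_set)
```

Calling list_archs('x86_64', ARCHIVES) yields the target architectures
only, with duplicates across gcc versions collapsed by the set.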

Suggested-by: Marek Vašut <marex@denx.de>
Suggested-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Simon Glass <sjg@chromium.org>
diff --git a/tools/buildman/README b/tools/buildman/README
index 849e6ca..cf7bf5c 100644
--- a/tools/buildman/README
+++ b/tools/buildman/README
@@ -173,9 +173,9 @@
 
 3. Make sure you have the require Python pre-requisites
 
-Buildman uses multiprocessing, Queue, shutil, StringIO and ConfigParser.
-These should normally be available, but if you get an error like this then
-you will need to obtain those modules:
+Buildman uses multiprocessing, Queue, shutil, StringIO, ConfigParser and
+urllib2. These should normally be available, but if you get an error like
+this then you will need to obtain those modules:
 
     ImportError: No module named multiprocessing
 
@@ -310,6 +310,47 @@
 be used (c88 and c99). This is a feature.
 
 
+5. Install new toolchains if needed
+
+You can download toolchains and update the [toolchain] section of the
+settings file to find them.
+
+To make this easier, buildman can automatically download and install
+toolchains from kernel.org. First list the available architectures:
+
+$ ./tools/buildman/buildman sandbox --fetch-arch list
+Checking: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.6.3/
+Checking: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.6.2/
+Checking: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.5.1/
+Checking: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.2.4/
+Available architectures: alpha am33_2.0 arm avr32 bfin cris crisv32 frv h8300
+hppa hppa64 i386 ia64 m32r m68k mips mips64 or32 powerpc powerpc64 s390x sh4
+sparc sparc64 tilegx x86_64 xtensa
+
+Then pick one and download it:
+
+$ ./tools/buildman/buildman sandbox --fetch-arch or32
+Checking: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.6.3/
+Checking: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.6.2/
+Checking: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.5.1/
+Downloading: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.5.1/x86_64-gcc-4.5.1-nolibc_or32-linux.tar.xz
+Unpacking to: /home/sjg/.buildman-toolchains
+Testing
+      - looking in '/home/sjg/.buildman-toolchains/gcc-4.5.1-nolibc/or32-linux/.'
+      - looking in '/home/sjg/.buildman-toolchains/gcc-4.5.1-nolibc/or32-linux/bin'
+         - found '/home/sjg/.buildman-toolchains/gcc-4.5.1-nolibc/or32-linux/bin/or32-linux-gcc'
+Tool chain test:  OK
+
+Buildman should now be set up to use your new toolchain.
+
+At the time of writing, U-Boot has these architectures:
+
+   arc, arm, avr32, blackfin, m68k, microblaze, mips, nds32, nios2, openrisc,
+   powerpc, sandbox, sh, sparc, x86
+
+Of these, only arc, microblaze and nds32 are not available at kernel.org.
+
+
 How to run it
 =============
 
diff --git a/tools/buildman/bsettings.py b/tools/buildman/bsettings.py
index 9eb9b2b..b361469 100644
--- a/tools/buildman/bsettings.py
+++ b/tools/buildman/bsettings.py
@@ -43,3 +43,13 @@
         return []
     except:
         raise
+
+def SetItem(section, tag, value):
+    """Set an item and write it back to the settings file"""
+    global settings
+    global config_fname
+
+    settings.set(section, tag, value)
+    if config_fname is not None:
+        with open(config_fname, 'w') as fd:
+            settings.write(fd)
diff --git a/tools/buildman/cmdline.py b/tools/buildman/cmdline.py
index 6ad376d..e884e19 100644
--- a/tools/buildman/cmdline.py
+++ b/tools/buildman/cmdline.py
@@ -36,6 +36,10 @@
     parser.add_option('-F', '--force-build-failures', dest='force_build_failures',
           action='store_true', default=False,
           help='Force build of previously-failed build')
+    parser.add_option('--fetch-arch', type='string',
+          help="Fetch a toolchain for architecture FETCH_ARCH ('list' to list)."
+              ' You can also fetch several toolchains separated by commas, or'
+              " 'all' to download all")
     parser.add_option('-g', '--git', type='string',
           help='Git repo containing branch to build', default='.')
     parser.add_option('-G', '--config-file', type='string',
diff --git a/tools/buildman/control.py b/tools/buildman/control.py
index cd0333c..a7c5822 100644
--- a/tools/buildman/control.py
+++ b/tools/buildman/control.py
@@ -118,6 +118,22 @@
         print
         return 0
 
+    if options.fetch_arch:
+        if options.fetch_arch == 'list':
+            sorted_list = toolchains.ListArchs()
+            print 'Available architectures: %s\n' % ' '.join(sorted_list)
+            return 0
+        else:
+            fetch_arch = options.fetch_arch
+            if fetch_arch == 'all':
+                fetch_arch = ','.join(toolchains.ListArchs())
+                print 'Downloading toolchains: %s\n' % fetch_arch
+            for arch in fetch_arch.split(','):
+                ret = toolchains.FetchAndInstall(arch)
+                if ret:
+                    return ret
+            return 0
+
     # Work out how many commits to build. We want to build everything on the
     # branch. We also build the upstream commit as a control so we can see
     # problems introduced by the first commit on the branch.
diff --git a/tools/buildman/test.py b/tools/buildman/test.py
index 25be43f..c0ad5d0 100644
--- a/tools/buildman/test.py
+++ b/tools/buildman/test.py
@@ -409,5 +409,11 @@
         self.toolchains.Add('i386-linux-gcc', test=False)
         self.assertTrue(self.toolchains.Select('x86') != None)
 
+    def testToolchainDownload(self):
+        """Test that we can download toolchains"""
+        self.assertEqual('https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.6.3/x86_64-gcc-4.6.3-nolibc_arm-unknown-linux-gnueabi.tar.xz',
+            self.toolchains.LocateArchUrl('arm'))
+
+
 if __name__ == "__main__":
     unittest.main()
diff --git a/tools/buildman/toolchain.py b/tools/buildman/toolchain.py
index ad4df8c..d4c5d4a 100644
--- a/tools/buildman/toolchain.py
+++ b/tools/buildman/toolchain.py
@@ -5,11 +5,42 @@
 
 import re
 import glob
+from HTMLParser import HTMLParser
 import os
+import sys
+import tempfile
+import urllib2
 
 import bsettings
 import command
 
+# Simple class to collect links from a page
+class MyHTMLParser(HTMLParser):
+    def __init__(self, arch):
+        """Create a new parser
+
+        After the parser runs, self.links will be set to a list of the links
+        to .xz archives found in the page, and self.arch_link will be set to
+        the one for the given architecture (or None if not found).
+
+        Args:
+            arch: Architecture to search for
+        """
+        HTMLParser.__init__(self)
+        self.arch_link = None
+        self.links = []
+        self._match = '_%s-' % arch
+
+    def handle_starttag(self, tag, attrs):
+        if tag == 'a':
+            for name, value in attrs:
+                if name == 'href':
+                    if value and value.endswith('.xz'):
+                        self.links.append(value)
+                        if self._match in value:
+                            self.arch_link = value
+
+
 class Toolchain:
     """A single toolchain
 
@@ -20,7 +51,6 @@
         arch: Architecture of toolchain as determined from the first
                 component of the filename. E.g. arm-linux-gcc becomes arm
     """
-
     def __init__(self, fname, test, verbose=False):
         """Create a new toolchain object.
 
@@ -116,18 +146,29 @@
         self.paths = []
         self._make_flags = dict(bsettings.GetItems('make-flags'))
 
-    def GetSettings(self):
+    def GetPathList(self):
+        """Get a list of available toolchain paths
+
+        Returns:
+            List of strings, each a path to a toolchain mentioned in the
+            [toolchain] section of the settings file.
+        """
         toolchains = bsettings.GetItems('toolchain')
         if not toolchains:
             print ("Warning: No tool chains - please add a [toolchain] section"
                  " to your buildman config file %s. See README for details" %
                  bsettings.config_fname)
 
+        paths = []
         for name, value in toolchains:
             if '*' in value:
-                self.paths += glob.glob(value)
+                paths += glob.glob(value)
             else:
-                self.paths.append(value)
+                paths.append(value)
+        return paths
+
+    def GetSettings(self):
+        self.paths += self.GetPathList()
 
     def Add(self, fname, test=True, verbose=False):
         """Add a toolchain to our list
@@ -147,6 +188,24 @@
         if add_it:
             self.toolchains[toolchain.arch] = toolchain
 
+    def ScanPath(self, path, verbose):
+        """Scan a path for a valid toolchain
+
+        Args:
+            path: Path to scan
+            verbose: True to print out progress information
+        Returns:
+            Filename of C compiler if found, else None
+        """
+        for subdir in ['.', 'bin', 'usr/bin']:
+            dirname = os.path.join(path, subdir)
+            if verbose: print "      - looking in '%s'" % dirname
+            for fname in glob.glob(dirname + '/*gcc'):
+                if verbose: print "         - found '%s'" % fname
+                return fname
+        return None
+
+
     def Scan(self, verbose):
         """Scan for available toolchains and select the best for each arch.
 
@@ -160,12 +219,9 @@
         if verbose: print 'Scanning for tool chains'
         for path in self.paths:
             if verbose: print "   - scanning path '%s'" % path
-            for subdir in ['.', 'bin', 'usr/bin']:
-                dirname = os.path.join(path, subdir)
-                if verbose: print "      - looking in '%s'" % dirname
-                for fname in glob.glob(dirname + '/*gcc'):
-                    if verbose: print "         - found '%s'" % fname
-                    self.Add(fname, True, verbose)
+            fname = self.ScanPath(path, verbose)
+            if fname:
+                self.Add(fname, True, verbose)
 
     def List(self):
         """List out the selected toolchains for each architecture"""
@@ -264,3 +320,160 @@
             else:
                 i += 1
         return args
+
+    def LocateArchUrl(self, fetch_arch):
+        """Find a toolchain available online
+
+        Look in standard places for available toolchains. At present the
+        only standard place is at kernel.org.
+
+        Args:
+            fetch_arch: Architecture to look for, or 'list' for all
+        Returns:
+            If fetch_arch is 'list', a tuple:
+                Machine architecture (e.g. x86_64)
+                List of links to the available toolchain archives
+            otherwise:
+                URL of this toolchain's archive if available, else None
+        """
+        arch = command.OutputOneLine('uname', '-m')
+        base = 'https://www.kernel.org/pub/tools/crosstool/files/bin'
+        versions = ['4.6.3', '4.6.2', '4.5.1', '4.2.4']
+        links = []
+        for version in versions:
+            url = '%s/%s/%s/' % (base, arch, version)
+            print 'Checking: %s' % url
+            response = urllib2.urlopen(url)
+            html = response.read()
+            parser = MyHTMLParser(fetch_arch)
+            parser.feed(html)
+            if fetch_arch == 'list':
+                links += parser.links
+            elif parser.arch_link:
+                return url + parser.arch_link
+        if fetch_arch == 'list':
+            return arch, links
+        return None
+
+    def Download(self, url):
+        """Download a file to a temporary directory
+
+        Args:
+            url: URL to download
+        Returns:
+            Tuple:
+                Temporary directory name
+                Full path to the downloaded archive file in that directory,
+                    or None if there was an error while downloading
+        """
+        print "Downloading: %s" % url
+        leaf = url.split('/')[-1]
+        tmpdir = tempfile.mkdtemp('.buildman')
+        response = urllib2.urlopen(url)
+        fname = os.path.join(tmpdir, leaf)
+        fd = open(fname, 'wb')
+        meta = response.info()
+        size = int(meta.getheaders("Content-Length")[0])
+        done = 0
+        block_size = 1 << 16
+        status = ''
+
+        # Read the file in chunks and show progress as we go
+        while True:
+            buffer = response.read(block_size)
+            if not buffer:
+                print chr(8) * (len(status) + 1), '\r',
+                break
+
+            done += len(buffer)
+            fd.write(buffer)
+            status = r"%10d MiB  [%3d%%]" % (done / 1024 / 1024,
+                                             done * 100 / size)
+            status = status + chr(8) * (len(status) + 1)
+            print status,
+            sys.stdout.flush()
+        fd.close()
+        if done != size:
+            print 'Error, failed to download'
+            os.remove(fname)
+            fname = None
+        return tmpdir, fname
+
+    def Unpack(self, fname, dest):
+        """Unpack a tar file
+
+        Args:
+            fname: Filename to unpack
+            dest: Destination directory
+        Returns:
+            Directory name of the first entry in the archive, without the
+            trailing /
+        """
+        stdout = command.Output('tar', 'xvfJ', fname, '-C', dest)
+        return stdout.splitlines()[0][:-1]
+
+    def TestSettingsHasPath(self, path):
+        """Check if buildman will find this toolchain
+
+        Returns:
+            True if the path is in settings, False if not
+        """
+        paths = self.GetPathList()
+        return path in paths
+
+    def ListArchs(self):
+        """List architectures with available toolchains to download"""
+        host_arch, archives = self.LocateArchUrl('list')
+        re_arch = re.compile('[-a-z0-9.]*_([^-]*)-.*')
+        arch_set = set()
+        for archive in archives:
+            # Remove the host architecture from the start
+            arch = re_arch.match(archive[len(host_arch):])
+            if arch:
+                arch_set.add(arch.group(1))
+        return sorted(arch_set)
+
+    def FetchAndInstall(self, arch):
+        """Fetch and install a new toolchain
+
+        Args:
+            arch: Architecture to fetch (e.g. 'arm')
+        """
+        # First get the URL for this architecture
+        url = self.LocateArchUrl(arch)
+        if not url:
+            print ("Cannot find toolchain for arch '%s' - use 'list' to list" %
+                   arch)
+            return 2
+        home = os.environ['HOME']
+        dest = os.path.join(home, '.buildman-toolchains')
+        if not os.path.exists(dest):
+            os.mkdir(dest)
+
+        # Download the tar file for this toolchain and unpack it
+        tmpdir, tarfile = self.Download(url)
+        if not tarfile:
+            return 1
+        print 'Unpacking to: %s' % dest,
+        sys.stdout.flush()
+        path = self.Unpack(tarfile, dest)
+        os.remove(tarfile)
+        os.rmdir(tmpdir)
+        print
+
+        # Check that the toolchain works
+        print 'Testing'
+        dirpath = os.path.join(dest, path)
+        compiler_fname = self.ScanPath(dirpath, True)
+        if not compiler_fname:
+            print 'Could not locate C compiler - fetch failed.'
+            return 1
+        toolchain = Toolchain(compiler_fname, True, True)
+
+        # Make sure that it will be found by buildman
+        if not self.TestSettingsHasPath(dirpath):
+            print ("Adding 'download' to config file '%s'" %
+                   bsettings.config_fname)
+            tools_dir = os.path.dirname(dirpath)
+            bsettings.SetItem('toolchain', 'download', '%s/*' % tools_dir)
+        return 0