Represent git-submodule as nested projects, take 2

(The previous submission of this change broke the Android buildbot due
 to an incorrect regular expression for parsing git-config output.
 During the investigation, we also found that Android, which pulls
 Chromium, has a workaround for Chromium's submodules: its manifest
 includes Chromium's submodules.  This new change, in addition to
 fixing the regex, also takes this type of workaround into
 consideration: it adds a new attribute that makes repo not fetch
 submodules unless they have a project element defined in the
 manifest, or this attribute is overridden by a parent project element
 or by the default element.)

We need a representation of git-submodule in repo; otherwise repo will
not sync submodules and will leave the workspace in a broken state.  Of
course this would not be a problem if all projects were owned by the
owner of the manifest file, who could simply choose not to use
git-submodule in any project.  However, this is not possible in
practice because the manifest file owner is unlikely to own all
upstream projects.

As git submodules are simply git repositories, it is natural to treat
them as plain repo projects that live inside a repo project.  That is,
we could use recursively declared projects to denote the is-submodule
relation of git repositories.

The behavior of repo remains the same for projects that do not contain
a subproject.  As for parent projects, repo fetches them and their
subprojects as normal projects, and then checks out each subproject at
the commit recorded in the parent's commit object.  A subproject is
fetched at a path relative to the parent project's working directory,
so the path specified in the manifest file should match the one in the
.gitmodules file.
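
As a rough illustration of the path rule above, the sketch below joins
a parent's worktree with a path from .gitmodules and then finds the
innermost project that contains a given path by walking up parent
directories.  The helper names, the sample paths, and the use of plain
strings in place of project objects are all assumptions made for this
example; it is not the code in this change.

```python
import posixpath


def subproject_worktree(parent_worktree, gitmodules_path):
    """Join a parent's worktree with a path taken from its .gitmodules."""
    return posixpath.normpath(posixpath.join(parent_worktree, gitmodules_path))


def innermost_project(by_path, path, topdir):
    """Walk up from path until a registered worktree is found."""
    oldpath = None
    while path and path != oldpath and path != topdir:
        if path in by_path:
            return by_path[path]
        oldpath = path
        path = posixpath.dirname(path)
    return None


# Hypothetical workspace: one parent project and one nested subproject.
topdir = "/work"
by_path = {"/work/chromium": "chromium"}
sub = subproject_worktree("/work/chromium", "third_party/skia")
by_path[sub] = "chromium/third_party/skia"
```

Because the walk starts at the deepest directory, a path inside the
subproject resolves to the subproject rather than to its parent.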

If a submodule is not registered in repo manifest, repo will derive its
properties from itself and its parent project, which might not always be
correct.  In such cases, the subproject is called a derived subproject.
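
One way to picture a derived subproject is sketched below.  The exact
derivation rule is not spelled out in this message, so the field names,
the naming scheme, and the choice to inherit the parent's remote are
assumptions for illustration only.  Inheriting the remote is also an
example of a guess that can be wrong, for instance when the submodule
is hosted elsewhere.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a repo project.
Project = namedtuple("Project", "name relpath remote revision derived")


def derive_subproject(parent, submodule_path, submodule_commit):
    """Build a subproject from its parent when the manifest has no entry."""
    return Project(
        name=parent.name + "/" + submodule_path,        # assumed naming scheme
        relpath=parent.relpath + "/" + submodule_path,  # path under the parent
        remote=parent.remote,                           # assumed: same remote
        revision=submodule_commit,  # commit recorded in the parent's tree
        derived=True,
    )


parent = Project("chromium", "chromium", "origin", "master", False)
child = derive_subproject(parent, "third_party/skia", "0123abc")
```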

To a user, a subproject is merely a git-submodule, so all the usual
advice for working with a git-submodule applies here, too.  For
example, you should not run `repo sync` in a parent repository if its
submodule is dirty.

Change-Id: I4b8344c1b9ccad2f58ad304573133e5d52e1faef
diff --git a/command.py b/command.py
index dc6052a..96d7848 100644
--- a/command.py
+++ b/command.py
@@ -100,7 +100,33 @@
     """
     raise NotImplementedError
 
-  def GetProjects(self, args, missing_ok=False):
+  def _ResetPathToProjectMap(self, projects):
+    self._by_path = dict((p.worktree, p) for p in projects)
+
+  def _UpdatePathToProjectMap(self, project):
+    self._by_path[project.worktree] = project
+
+  def _GetProjectByPath(self, path):
+    project = None
+    if os.path.exists(path):
+      oldpath = None
+      while path \
+        and path != oldpath \
+        and path != self.manifest.topdir:
+        try:
+          project = self._by_path[path]
+          break
+        except KeyError:
+          oldpath = path
+          path = os.path.dirname(path)
+    else:
+      try:
+        project = self._by_path[path]
+      except KeyError:
+        pass
+    return project
+
+  def GetProjects(self, args, missing_ok=False, submodules_ok=False):
     """A list of projects that match the arguments.
     """
     all_projects = self.manifest.projects
@@ -114,40 +140,37 @@
     groups = [x for x in re.split(r'[,\s]+', groups) if x]
 
     if not args:
-      for project in all_projects.values():
+      all_projects_list = all_projects.values()
+      derived_projects = {}
+      for project in all_projects_list:
+        if submodules_ok or project.sync_s:
+          derived_projects.update((p.name, p)
+                                  for p in project.GetDerivedSubprojects())
+      all_projects_list.extend(derived_projects.values())
+      for project in all_projects_list:
         if ((missing_ok or project.Exists) and
             project.MatchesGroups(groups)):
           result.append(project)
     else:
-      by_path = None
+      self._ResetPathToProjectMap(all_projects.values())
 
       for arg in args:
         project = all_projects.get(arg)
 
         if not project:
           path = os.path.abspath(arg).replace('\\', '/')
-
-          if not by_path:
-            by_path = dict()
-            for p in all_projects.values():
-              by_path[p.worktree] = p
+          project = self._GetProjectByPath(path)
 
-          if os.path.exists(path):
-            oldpath = None
-            while path \
-              and path != oldpath \
-              and path != self.manifest.topdir:
-              try:
-                project = by_path[path]
-                break
-              except KeyError:
-                oldpath = path
-                path = os.path.dirname(path)
-          else:
-            try:
-              project = by_path[path]
-            except KeyError:
-              pass
+          # If it's not a derived project, update path->project mapping and
+          # search again, as arg might actually point to a derived subproject.
+          if (project and not project.Derived and
+              (submodules_ok or project.sync_s)):
+            search_again = False
+            for subproject in project.GetDerivedSubprojects():
+              self._UpdatePathToProjectMap(subproject)
+              search_again = True
+            if search_again:
+              project = self._GetProjectByPath(path) or project
 
         if not project:
           raise NoSuchProjectError(arg)