Rework the crash reporting in BL3-1 to use less stack

This patch reworks the crash reporting mechanism to further
optimise the stack and code size. The reporting makes use
of assembly console functions to avoid calling C Runtime
to report the CPU state. The crash buffer requirement is
reduced to 64 bytes with this implementation. The crash
buffer is now part of per-cpu data which makes retrieving
the crash buffer trivial.

Also now panic() will use crash reporting if
invoked from BL3-1.

Fixes ARM-software/tf-issues#199

Change-Id: I79d27a4524583d723483165dc40801f45e627da5
diff --git a/Makefile b/Makefile
index f8a7da8..fef89c2 100644
--- a/Makefile
+++ b/Makefile
@@ -92,9 +92,8 @@
 VERSION_STRING		:=	v${VERSION_MAJOR}.${VERSION_MINOR}(${BUILD_TYPE}):${BUILD_STRING}
 
 BL_COMMON_SOURCES	:=	common/bl_common.c			\
-				common/debug.c				\
 				common/tf_printf.c			\
-				common/aarch64/assert.S			\
+				common/aarch64/debug.S			\
 				lib/aarch64/cache_helpers.S		\
 				lib/aarch64/misc_helpers.S		\
 				lib/aarch64/xlat_helpers.c		\