ARM: Use CPU local lock before sending load to mem system.

This change uses the locked_mem.hh header to handle implementing CLREX. It
simplifies the current implementation greatly.
5 files changed