Self-Healing Computers for Damaged Spaceships

Article written: 27 Apr, 2008
Updated: 24 Dec, 2015

What happens when a robotic space probe breaks down millions of miles away from the nearest spacecraft engineer? If there is a software bug, engineers can sometimes correct the problem by uploading new commands, but what if the computer hardware fails? If the hardware is controlling something critical like the thrusters or communications system, there isn’t a lot mission control can do; the mission may be lost. Sometimes failed satellites can be recovered from orbit, but there’s no interplanetary towing service for missions to Mars. Can anything be done for damaged computer systems far from home? The answer might lie in a project called “Scalable Self-Configurable Architecture for Reusable Space Systems”. But don’t worry, machines aren’t becoming self-aware; they’re just learning how to fix themselves…

When spacecraft malfunction on the way to their destinations, often there’s not a lot mission controllers can do. Of course, if they are within our reach (i.e. satellites in Earth orbit), there’s the possibility that they can be picked up by Space Shuttle crews or fixed in orbit. In 1984, for example, two malfunctioning satellites were picked up by Discovery on the STS-51A mission (pictured above). Both communications satellites had malfunctioning motors and couldn’t maintain their orbits. In 1993, Space Shuttle Endeavour (STS-61) carried out an in-orbit repair of the Hubble Space Telescope, installing corrective optics to compensate for its flawed mirror. (Of course, there’s always the option that top secret dead spy satellites can be shot down too.)

Although both of the retrieve/repair mission examples above most likely involved mechanical failure, the same could have been done if their onboard computer systems failed (if it was worth the cost of an expensive manned repair mission). But what if one of the robotic missions beyond Earth orbit suffered a frustrating hardware malfunction? It needn’t be a huge error either (if it happened on Earth, the problem could probably be fixed quickly), but in space with no engineer present, this small error could spell doom for the mission.

So what’s the answer? Build a computer that can fix itself. It might sound like the Terminator 2 storyline, but researchers at the University of Arizona are investigating this possibility. NASA is funding the work and the Jet Propulsion Laboratory is taking them seriously.

Ali Akoglu (assistant professor in computer engineering) and his team are developing a hybrid hardware/software system that may be used by computers to heal themselves. The researchers are using Field Programmable Gate Arrays (FPGAs) to create self-healing processes at the chip-level.

FPGAs combine hardware and software. Because some hardware functions are carried out at chip level, the software acts as FPGA “firmware”. Firmware is a common computing term for software instructions embedded in a hardware device. Although a microprocessor processes firmware as it would any other piece of software, firmware is written specifically for the device it runs on. In this respect, firmware mimics hardware processes. This is where Akoglu’s research comes in.
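
To make the idea of software standing in for circuitry a little more concrete, here is a minimal Python sketch. It is only a toy model, not the SCARS design: it assumes a single two-input logic cell built as a look-up table, which is a common FPGA building block. The names LookupTable, AND_CONFIG and XOR_CONFIG are invented for this illustration.

```python
from itertools import product


class LookupTable:
    """A 2-input logic cell whose behaviour is set entirely by 4 configuration bits."""

    def __init__(self, config_bits):
        self.load(config_bits)

    def load(self, config_bits):
        # Reconfiguring the cell means overwriting its stored truth table.
        if len(config_bits) != 4 or any(b not in (0, 1) for b in config_bits):
            raise ValueError("expected four bits, each 0 or 1")
        self.config_bits = tuple(config_bits)

    def evaluate(self, a, b):
        # The two inputs select one entry of the stored truth table.
        return self.config_bits[(a << 1) | b]


# Hypothetical configurations: the same cell can act as an AND gate or an XOR gate.
AND_CONFIG = (0, 0, 0, 1)
XOR_CONFIG = (0, 1, 1, 0)

cell = LookupTable(AND_CONFIG)
and_outputs = [cell.evaluate(a, b) for a, b in product((0, 1), repeat=2)]

cell.load(XOR_CONFIG)  # no hardware is swapped, only the configuration data
xor_outputs = [cell.evaluate(a, b) for a, b in product((0, 1), repeat=2)]

print(and_outputs)
print(xor_outputs)
```

The point of the sketch is that the "circuit" changes when the stored bits change, which is why rewriting the configuration can stand in for rewiring.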

The researchers are in the second phase of the project, called Scalable Self-Configurable Architecture for Reusable Space Systems (SCARS), and have set up five wirelessly networked units that could easily represent five cooperating rovers on Mars. When a hardware malfunction occurs, the networked “buddies” deal with the problem on two levels. First, the troubled unit attempts to repair the glitch at node level. By reconfiguring the firmware, the unit is effectively reconfiguring the circuit, bypassing the error. If it is unsuccessful, the unit’s buddies perform a back-up operation, reprogramming themselves to carry out the broken unit’s operations as well as their own. Unit-level intelligence is used in the first case, but should this fail, network-level intelligence is used. All the operations are performed automatically; there is no human intervention.
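
The two-level recovery described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the researchers’ implementation: the Unit class, the spare_configs counter (standing in for alternative circuit layouts) and the round-robin hand-off of tasks are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    name: str
    tasks: list                                   # tasks this unit is responsible for
    spare_configs: int = 1                        # hypothetical count of alternative layouts
    healthy: bool = True
    adopted: list = field(default_factory=list)   # tasks taken over from buddies

    def repair_locally(self):
        """Node level: try an alternative configuration that bypasses the fault."""
        if self.spare_configs > 0:
            self.spare_configs -= 1
            self.healthy = True
        return self.healthy


def handle_fault(failed, units):
    """Return which level resolved the fault: 'node', 'network' or 'none'."""
    failed.healthy = False
    if failed.repair_locally():
        return "node"
    buddies = [u for u in units if u is not failed and u.healthy]
    if not buddies:
        return "none"
    # Network level: share the failed unit's tasks round-robin among its buddies.
    for i, task in enumerate(failed.tasks + failed.adopted):
        buddies[i % len(buddies)].adopted.append(task)
    failed.tasks, failed.adopted = [], []
    return "network"


# Five units, mirroring the five networked units in the test set-up.
units = [Unit(f"rover{i}", [f"task{i}"]) for i in range(5)]
first = handle_fault(units[0], units)    # a spare configuration is still available
second = handle_fault(units[0], units)   # spares exhausted, so the buddies take over
print(first, second, [u.adopted for u in units])
```

In this toy version the first fault is fixed by the unit itself, and the second is absorbed by a neighbour that adds the broken unit’s task to its own, with no operator in the loop.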

This is some captivating research with far-reaching benefits. If computers could heal themselves at long distance, millions of dollars would be saved. Also, the longevity of space missions may be extended. This research would also be valuable to future manned missions. Although the majority of computer issues can be fixed by astronauts, critical systems failures will occur; a system such as SCARS could provide life-saving back-up while the source of the problem is being found.

Source: UA News


13 Responses

  1. Timber says

    Ian,

    That is an extraordinary photo; I have not seen that before. Who is it at the end of the satellite and what is it he is actually doing? Is he using his thrusters to maneuver the unit to Discovery? That had to have been a lonnnggg untethered maneuver; his heart rate must have been off the chart.

  2. Vanamonde says

    USA 193 was destroyed but not “shot down”. Pieces re-entered but pieces are still orbiting. You cannot de-orbit with a kinetic energy missile – you can only create a cloud of little pieces, each on their own path. But the solid hydrazine (supposedly, the reason for this exercise) was evaporated.

  3. Chuck Lam says

    What happened to parallel redundancy? Chips, software and simple hardware drives are cheap and reliable. Redundancy works. Let’s keep things simple. Simple equals reliability. Let’s sacrifice a few extra pounds of payload for reliability. Oops! I forgot for a moment, NASA is working on this “Scalable Self-Configurable Architecture for Reusable Space Systems” project. Whew, what a mouthful! Oh well! At least we see where some of our tax dollars are wasted, err . . . invested.

  4. Astrofiend (Syd, Aust) says

    This is part of the reason I’m so excited about missions that will be heading out in the next 10-20 years. The greatly enhanced reliability at the circuit-level that this kind of redundancy will yield will be a great boon for future missions, possibly allowing for very robust spacecraft that could operate in more extreme environments for longer periods of time.

    But it’s research of the sort that the SCARS group are working on that is really exciting stuff. One can imagine scalable swarms of semi-autonomous rovers/vehicles/devices that could communicate with each other and with Earth to really go out and get the scientific observations done at a cracking pace. You could have a large number of specialized vehicles with very specific abilities working at the same time around the clock. With the ability to communicate status and objectives between themselves, mission planners could prioritize observations and objectives and let their little minions go out and get busy.

    The Earth sciences benefit from such systems too – I believe the concept has already been worked up for swarms of small underwater autonomous vehicles that can go out and plumb the depths of the oceans by themselves, uplink their observations on a regular basis, and communicate in a limited but coordinated manner with each other.

    Of course, some of these abilities are certainly a while off, but they are intriguing nonetheless. The possibilities for future development of such technology are a little exciting and a bit scary at the same time.

  5. Astrofiend (Syd, Aust) says

    Vanamonde Says:
    April 27th, 2008 at 10:22 pm

    “USA 193 was destroyed but not “shot down”. Pieces re-entered but pieces are still orbiting. You cannot de-orbit with a kinetic energy missile – you can only create a cloud of little pieces, each on their own path.”

    I think the orbit of USA 193 was low enough so that atmospheric drag (though the atmosphere is tenuous at that altitude) is supposed to make the orbits of the small pieces decay in a fairly short time frame. I heard somewhere that it would be a matter of weeks until reentry for the vast majority of the debris, but I may be wrong…

    I know other satellites have been blown to pieces by the likes of China in high-altitude orbits though; the orbits of which certainly won’t decay for quite some time.

  6. cheech says

    Why can’t they just make an engineer robot with spare equipment? And a spare robot in case he breaks down?

  7. RL says

    This kind of research is exactly why space exploration funding is beneficial, not only for the direct science involved but for the spinoffs that can be applied elsewhere. Self-healing electronics could have a huge number of applications to all sorts of products.

    And I can’t help but think to myself, that should this work be successful (it sounds incredibly complex to make practical but I think it will be done), that Mr Akoglu will have put “the ghost in the machine”.

  8. ioresult says

    They never changed the mirror inside Hubble. They added corrective optics.

  9. Aidan says

    Timber

    You asked “Who is it at the end of the satellite and what is it he is actually doing?”
    It looks like he’s holding an oversized steering wheel; I suppose he’s driving the satellite.

  10. Timber says

    Good answer Aidan 🙂

  11. someguy says

    HAL: “Just a moment. Just a moment. I’ve just picked up a fault in the AE-35 unit. It’s going to go 100% failure in 72 hours.”

    DAVE: “….”

    HAL: “Nvm i fixed it.”

  12. alphonso richardson says

    So this is a bit like chip-scale LANs/WANs, as opposed to redundant circuits, or am I missing some really obvious concept?

  13. Chuck Lam says

    To alphonso richardson, Nope! You’re not missing anything.
