My Ramblings

Human intelligence is not interpretable. Why should machine intelligence be?

While intelligence is hard to define exactly, one aspect of it that we value is the ability to navigate complex problems. Complex systems don't have simple answers or simple paths to answers. This means the machinery itself must be subtle and capable of taking slightly different situations to vastly different results. Untangling any complex process is a monumental task even in the most amenable of circumstances. In practice, we interpret other humans mostly through facial expressions and words, both of which are external-facing. Brain scans and lie detectors exist, but they are fallible.

Human-human alignment is not achieved through interpretability. It is achieved through creating mutually beneficial systems. We have shared moral systems and laws because a world without them would be far more chaotic and painful to live in. Those ideals and laws also change as the fundamentals of the world we live in change. The mutual benefit can come from our ability to generate value, but also from the power to enforce. The first is obviously more durable where possible.

In the short term, I do believe that better understanding how existing systems function deepens human knowledge, but as a long-term solution to alignment, I don't think it's getting us anywhere. We'll need to create systems that are mutually beneficial to maintain.

P.S. This idea came together in just a few minutes. Don't take anything that I write too seriously.