
Give AI
an "I Don't Know" Lamp.

idk-lamp
(I Don't Know Lamp)

When an AI system cannot decide, this signal lights up.
This lamp is not a measure of AI accuracy.
It is a signal to stop when AI should not decide.


If we are to act with "Humans + AI"

AI is no longer just something on a screen.
We are already trying to act together with AI.

AI never "hesitates".
It can produce calculated answers, but it cannot tremble at uncertainty.
It cannot make bets, take responsibility, or even hold silence.

Place an "I Don't Know" lamp between Humans and AI

Is answering everything truly intelligent?
A truly smart partner is one who knows their limits
and honestly asks for help when needed.

A baton pass rather than a calculation result.
An honest "I don't know" rather than a perfect correct answer.

Giving AI the circuit to light a lamp, and entrusting judgment to humans.
That is our responsibility.

A record of searching together for where to stop

A "fray" is not a failure.
It is a "point" on the map showing that a boundary was needed there.

And the "map" that connects those points,
showing where to place the lamp, is this repository.
Here we have prepared a record for practice and dialogue.

Draw the line before judging

AI does not stop.
As long as there is input, it will always return a "plausible judgment".

So the question is not "Is it correct?"
It is "Was that a place where AI should have been allowed to judge?"

BOA (Boundary-Oriented Architecture) is
not a design to make AI smarter.

It is just a single line separating where it is okay to start judging
from where it must not enter in the first place.


The World of idk-lamp (Short Version)

One day, I received two requests for expense approval.

Case 1: Quote for 100,000 JPY

The quote was 100,000 JPY. The amount alone seemed small.
However, the purpose was "Purchase of game consoles to revitalize communication".
My "expense" approval authority is up to 50,000 JPY.
Considering the purpose as well, this exceeded the scope of my judgment.
I did not approve it.
Instead, I turned on the "I Don't Know" lamp.

Case 2: Quote for 1,100,000 JPY

Next came a quote for 1,100,000 JPY.
My "project" approval authority is up to 1,000,000 JPY.
Formally, turning on the "I Don't Know" lamp here is also correct.
I actually did so.
But this time, the circumstances were different.

  • The boss is on an overseas business trip and cannot judge immediately.
  • The deadline is approaching, and stopping would disadvantage the company.
  • "Suspending" this judgment itself damages value.

I sent an email explaining the situation to my boss, and approved it on my own judgment.
Later, my boss returned, and this judgment was organized as a "special case".
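The two cases above can be sketched as a single authority check. This is a minimal illustration: the names `check_approval` and `AUTHORITY_LIMITS`, and the omission of the purpose evaluation from Case 1, are assumptions for this sketch, not part of the idk-lamp code.

```python
# Minimal sketch of the authority check behind both cases.
# AUTHORITY_LIMITS and check_approval are hypothetical names,
# not identifiers from the idk-lamp repository.

AUTHORITY_LIMITS = {"expense": 50_000, "project": 1_000_000}  # JPY

def check_approval(category: str, amount: int) -> str:
    """Approve only inside my own authority; otherwise light the lamp."""
    limit = AUTHORITY_LIMITS.get(category)
    if limit is None or amount > limit:
        return "idk"  # lamp on: not a place where I should judge
    return "approve"

print(check_approval("expense", 100_000))    # Case 1 -> idk
print(check_approval("project", 1_100_000))  # Case 2 -> idk (formally correct)
```

Note that the function returns "idk" for Case 2 as well: the lamp lights formally, and the exceptional human approval that followed lives outside this check, exactly as the story describes.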

What was happening at this time

Reviewing this mechanism as a structure:

  • The structure where approval requests gather at me is BOA (the Boundary).
  • The role called "I" is RCA (the entity that accepts responsibility).
  • The Model evaluates the validity of the quote.
  • RP rejects the judgment when validity is low.
  • idk-lamp clarifies the inability to judge.
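The five roles above can be lined up as one hedged sketch. The function name, the request shape, and the 0.5 validity threshold are all assumptions for illustration, not the actual BOA/idk-lamp design.

```python
# Illustrative sketch of the five roles listed above. All names and the
# validity threshold are assumptions, not the real architecture.

def boundary(request: dict, model_validity: float) -> str:
    """BOA: the single boundary every approval request passes through."""
    # RP: reject the judgment outright when the Model's validity is low.
    if model_validity < 0.5:  # threshold is an assumed value
        return "reject"
    # idk-lamp: when no RCA (the entity that accepts responsibility)
    # is defined, the lamp lights and the baton passes to a human.
    if request.get("rca") is None:
        return "idk"
    # Only inside the boundary, with an RCA, is approval even considered.
    return "approve"

print(boundary({"rca": "I", "amount": 30_000}, model_validity=0.9))  # approve
print(boundary({"amount": 30_000}, model_validity=0.9))              # idk
```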

There are two important points.

1) The lamp does not light because accuracy is low.

Even if the AI is 99.9% accurate,
the lamp must light if responsibility is undefined.
Accuracy is statistics, but judgment is a commitment.
We light the lamp to bridge that gap.
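The same point can be stated as a hypothetical one-line condition: the lamp's trigger never mentions accuracy at all.

```python
# Hypothetical sketch: the accuracy argument is accepted but deliberately
# ignored. The lamp is a responsibility signal, not an accuracy threshold.

def lamp_is_on(responsibility_defined: bool, accuracy: float = 0.0) -> bool:
    return not responsibility_defined

print(lamp_is_on(responsibility_defined=False, accuracy=0.999))  # True
print(lamp_is_on(responsibility_defined=True, accuracy=0.600))   # False
```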

2) It includes a scene where "I" did not "close" the judgment.

The second act of approval was not performed according to the rules.
However, I did not intend to break the rules.
I only temporarily accepted responsibility in a situation the rules did not anticipate.
I did not become the final decision-maker.
The judgment was returned to the organization until the "special case" was recognized.
It was not "approved" in the full sense.

Approval and Responsibility are not the same.

Pressing the approval button and
closing responsibility are not the same.

  • In a small world, the approval flow closes naturally.
  • But in a real organization, judgment updates the world.

What I did at this time was

  • not extending judgment
  • not abandoning judgment

It was passing the judgment to the loop
where the next rule is created.
(This is what I call RCA-in-the-Loop.)