Research Publication: My AI Idea
I have decided to just go ahead and publish my notes on AI. The idea isn't great for "current era" AI; it might be useful when quantum computers come out and NP problems can all be solved efficiently. Here are my notes:
AI idea
4-5-2026
You could define the nouns and verbs in terms of substitutable words and phrases, along with the strength of the substitution.
A human user could be prompted with candidate substitutions for a phrase (or, e.g., a verb) and could rank them.
Even adjectival phrases, e.g., could be defined in this way.
Also think of transposition, concealment, abbreviation, and substitution. We could also rate the extent of the meaning change of a phrase or sentence if one word or phrase were inserted or deleted.
Have a great specific rubric for how to do the ratings!
Consider both axiomatic and deduced rules and ratings.
The dictionary takes the form of direct substitution.
How do we deduce new rules?? That’s an open question.
Study Markov algorithms…aren’t they Turing-complete?
“The answer to the question X is _____.” Use that to reason.
Note you could use existing AI to help with this AI.
3:37
One way to deduce rules: Take two existing “good and accurate and well-written passages” that are similar to each other (you could search the internet to find them). Then, deduce some substitution possibilities for the second passage based on the first passage using existing rules, which include both dictionary definitions and other established substitutions and strengths.
Should we include conditions for sub strength?
Eg, immediately after/before passage X, we can substitute passage Y for passage Z with strength V (a number).
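A minimal sketch of what one conditional rule could look like in code. The class name `SubstitutionRule`, its field names, and the "immediately after passage X" matching logic are my own assumptions for illustration; only the X/Y/Z/V roles come from the note above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubstitutionRule:
    """Hypothetical container for one rule; field names are illustrative."""
    target: str                        # passage Z, the text to be replaced
    replacement: str                   # passage Y, the text substituted in
    strength: float                    # V, the substitution strength
    predecessor: Optional[str] = None  # passage X that must come right before Z


def apply_rule(text: str, rule: SubstitutionRule) -> str:
    """Replace the first eligible occurrence of rule.target, or return text unchanged."""
    needle = rule.target if rule.predecessor is None else rule.predecessor + rule.target
    start = text.find(needle)
    if start == -1:
        return text
    # Skip over the predecessor so that only the target itself is replaced.
    start += len(needle) - len(rule.target)
    return text[:start] + rule.replacement + text[start + len(rule.target):]


rule = SubstitutionRule(target="big", replacement="large", strength=0.9, predecessor="a ")
print(apply_rule("big cats and a big dog", rule))  # big cats and a large dog
```

The first "big" is left alone because it is not preceded by "a ", which is the whole point of the condition.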
5:15 p.m.
We could also define “passage types.” Then we could say, “After a passage of Passage Type A, the strength of (Passage / Passage Type) B being replaced with (Passage / Passage Type) C is equal to the rational number R.”
A passage type is just a collection of possible passages, which can include also other passage types. (No recursion allowed here.)
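Here is a rough sketch of a passage type as a collection that can contain other passage types, with recursion rejected. The registry layout, the "@" prefix for referring to another type, and the function name `expand` are all assumptions made for the example.

```python
from typing import Dict, List, Set

# Hypothetical registry: each passage type lists literal passages and other type names.
PASSAGE_TYPES: Dict[str, List[str]] = {
    "GREETING": ["hello", "hi there"],
    "OPENER": ["good morning", "@GREETING"],  # "@" marks a reference to another type
}


def expand(type_name: str, registry: Dict[str, List[str]],
           _seen: frozenset = frozenset()) -> Set[str]:
    """Return every literal passage in a type, rejecting recursive definitions."""
    if type_name in _seen:
        raise ValueError(f"recursive passage type: {type_name}")
    passages: Set[str] = set()
    for member in registry[type_name]:
        if member.startswith("@"):
            # Nested passage type: expand it, remembering the path taken so far.
            passages |= expand(member[1:], registry, _seen | {type_name})
        else:
            passages.add(member)
    return passages


print(sorted(expand("OPENER", PASSAGE_TYPES)))  # ['good morning', 'hello', 'hi there']
```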
5:23 p.m.
From my blog:
One thing Apple could build is a “verbose” music recommendation coach/system that tells you why you might like a song. Apple could learn from conversations with users to continually improve the system.
5:33 p.m.
Could we produce automated fiction novels with this technology?
We can have “appropriate replacement” as the thing we’re focusing on.
We could have *universal* replacement, too. E.g., we could have a variable in a sentence, and just say, it could be anything, and the question is, what’s the thing that we can replace it with? What is the best word or phrase to replace a variable with, in terms of strength, if we don’t have any idea what the variable is?
Sometimes, replacement strengths won’t be 2-way, of course. I.e., you can replace X with Y with good strength, but not Y with X.
Also, be sure to keep in mind the “grammatical identity” of each word or phrase, i.e., it’s like the part of speech that we’re concerned with preserving when we substitute.
4-6-2026
Probably, the final AI program, for a given user, will be like a Markov algorithm…except that there will be other rules besides just strict substitution rules. In particular, there would also be “variable substitution rules,” concealment rules, conditional (predecessor text) substitution rules—which can be written as substitution rules, btw, in the Markov algorithm sense—and grammatical constraints, i.e., we would consider the “grammatical type” of each word/phrase, and would have separate rules for nouns, verbs, adjectives, subjects, predicates, direct objects, etc.
9:29 a.m.
You might use the strength of each rule as a way to order it, i.e., order the substitution and other rules in the Markov-algorithm-like algorithm by strength, so that the strongest rules come first.
The strength should reveal something about preference and indifference…strength = 0.5 means, we don’t care if we do the swap or not, we’re indifferent…0.75, e.g., means we would prefer to swap than not swap, and 1.0 is the maximum, where we definitely want to do this swap. 0.1, e.g., means we don’t really want to do the swap in this case.
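A small sketch of the "strongest rule first" idea together with the 0.5 indifference point. The tuple layout, the `INDIFFERENCE` constant, the step limit, and the choice to skip any rule at or below 0.5 are my assumptions, not something the notes pin down.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str, float]  # (target, replacement, strength); layout is illustrative

INDIFFERENCE = 0.5  # strength at which swapping and not swapping are equally good


def strongest_applicable(text: str, rules: List[Rule]) -> Optional[Rule]:
    """Return the strongest rule whose target occurs in text and that we prefer to apply."""
    for rule in sorted(rules, key=lambda r: r[2], reverse=True):
        target, _, strength = rule
        if strength <= INDIFFERENCE:
            break  # every remaining rule is at or below indifference
        if target in text:
            return rule
    return None


def run(text: str, rules: List[Rule], max_steps: int = 100) -> str:
    """Repeatedly apply the strongest applicable rule, Markov-algorithm style."""
    for _ in range(max_steps):  # the step limit guards against rules that loop forever
        rule = strongest_applicable(text, rules)
        if rule is None:
            break
        text = text.replace(rule[0], rule[1], 1)
    return text


rules = [("car", "automobile", 0.75), ("fast", "quick", 1.0), ("red", "crimson", 0.1)]
print(run("a fast red car", rules))  # a quick red automobile
```

The 0.1 rule never fires here, which matches the reading that a low strength means we would rather not swap.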
The key to making this flexible enough to work is, having variable sentences, where the variable could be any word or phrase, with some constraints, such as part of speech or whether it’s a short phrase or just a word.
Also, in theory, we could have any suggested replacement have a strength rating…even if it’s not one we’ve considered.
If we have an “oracle” for replacement, i.e., oracle(string1, string2) = strength_rating, that would really help.
To devise this oracle, we could try searching the internet for each string. How would we assess the replaceability of one by the other? We could see, as a “distance,” how many replacements are needed from one passage to generate the other string.
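One way to make the "distance" concrete is a word-level edit distance, i.e., the minimum number of word insertions, deletions, and replacements needed to turn one passage into the other. This is a standard Levenshtein computation swapped in as a stand-in; the notes do not say which distance to use, and the function name is hypothetical.

```python
from typing import List


def replacement_distance(passage_a: str, passage_b: str) -> int:
    """Word-level Levenshtein distance between two passages."""
    a: List[str] = passage_a.split()
    b: List[str] = passage_b.split()
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a to each prefix of b
    for i, word_a in enumerate(a, start=1):
        curr = [i]  # turning the first i words of a into nothing takes i deletions
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or replace
        prev = curr
    return prev[-1]


print(replacement_distance("the cat sat", "the dog sat"))  # 1
```

How to turn this count into a strength on the 0 to 1 scale is left open here.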
Also, definitions will give us “high strength ratings,” i.e., it’s fine to replace a word with its synonym, or with its definition, if it’s phrased right.
9:37 p.m.
Remember passage types. Passage types are key.
9:39 p.m.
Key things to keep in mind:
- passage types
- Markov algorithms
- word/phrase substitution rule strengths
- restricted substitution based on types of phrases/words—grammatical ID, like direct object, subject, verb, adjective, adverb
- variable words/phrases (i.e., we want to replace a word/phrase that could be anything with a specific word/phrase/passage type)
- the notion of a replacement oracle, that tells us the strength of the replacement for string1 being replaced by string2
- a very good idea: Just use a chat bot to do the replacement oracle. You can tell your account when its results are good/bad, in some cases where you know, to train it. Make sure your chatbot chats can’t be read by the company that runs the chatbot.
- be sure to add in extra rules, such as based on dictionaries and thesauruses…try to have a way to have your AI chatbot read that information in, i.e., share your existing preliminary substitution strength rules with the “oracle,” as given data before you query
- also, don’t forget about query/command/axiom
- again: load up the chatbot with all the rules you get from dictionaries and thesauruses, then, you can query it
- for the music program (which is safe), gate access to things unrelated to music, and, require that the bot be safe, i.e., don’t let the bot talk about extremist politics or suicide or religion…keep it restricted to music recommendation…write out the rules in English
- require that the chatbot not violate the rules…you can go ahead and talk to the chatbot in English to try to restrict its behavior
- build a bot of your own that writes queries to the chatbot you purchase good access to
- consider including link-based text ads within the bot to make money, in case no one buys your product…you could try to get music bands who want their songs to get more visibility to advertise with you
- I could maybe use Google AI…but I would have to be able to run long queries, and an unlimited number of them
- Make sure to use the conversations you have with humans as part of the service as fodder for rules…the idea is, each answer to a question is one big and lengthy substitution…a sentence for a sentence. Use AI to try to see how much more can be inferred beyond that.
- “Given (all the rules x1, x2, x3, … , x_k) and the passage, ‘The answer to <key question Q> is word_var1, word_var2, word_var3, word_var4,’ good substitution rules for word_var1, word_var2, word_var3, word_var4 are: (fill in the blank with the answer to the query).”
- Restrict word_var1, etc., to be words from song titles.
- Do lots of background work, i.e., add to your list of rules in the database. After you have a lot of good, relevant rules, e.g., rules related to sentences making claims about songs, try to query the bot and see if it can get maximally strong substitution rules.
- Most of this is: building a database of rules about “relevant sentences” to the music described by the user, and, querying a chatbot. Not that much high-intricacy software used.
- You can have two modes for the music recommendation bot for the user: 1). Ratings only, 2). Describe the songs you like and give a little bit of detail about why you like them.
- My AI should parse a full dictionary and a full thesaurus before running.
- Also, include song lyrics for the different songs in the database, so that there’s a “handle” for the AI to grab on to
- Have two settings for the user and two settings for the bot—quiet, verbose—in both cases.
- Let the user rate each bot recommendation, from 1-10. Let the user rate the interaction verbally if they are set to verbose.
- Let users rate both the song recommendation and the accompanying passage of text, when the AI is in “verbose mode”
- For the ads: When a user searches for a song, try to recommend an advertised song, and write a short persuasive passage explaining why the advertised song might be good for the user to listen to…try to make it *persuasive*, get the AI to manage it that way…try to frame the song so that the user will like it when he/she clicks on the link, assuming he/she does…try to measure and increase the odds that the link will be clicked, and that the user will like the advertised song
04-06
12:17 p.m.
(Some of the items in the list above were added today, not last night.)
In case this doesn’t work—if regular AI capabilities aren’t good enough to solve the problem algorithmically—this could be mothballed and could wait until the company I sold to (or started) had access to quantum computers.
This idea is *better* than the Occam’s razor idea!! :-D
In particular, the Occam’s razor idea could be used on the “strength rating oracle” to make the oracle work well, and then the idea would be that this would be much stronger than just using the OR idea on regular natural-language sentences.
12:22 p.m.
How to “code” a word or phrase:
- list the part of speech
- list the substitution, concealment, and abbreviation rules, with a strength rating for each one
Note, in general, you’re going to just apply it like a Markov algorithm…just implement the strongest replacement rule. Also, in some cases, you’ll “mask” an actual word or phrase, and treat it as a “universal variable,” and then replace.
Recall that the sub rules can be based on passage types or on passages, i.e., words/phrases.
The general way to do it is, use the replacement strength algorithm, and, go like this:
“The answer to the question [insert question here], is, BLANK1, BLANK2, BLANK3, BLANK4, BLANK5.”
That gives you five words to answer the question.
A question: Can modern AI already do a sort-of good job at Occam’s razor problems? I guess: No, not really.
I don’t think the techniques I’m familiar with are the ones that are used to do modern AI.
12:48 p.m.
Someday, you could build a Markov algorithm based on Occam’s razor with QC. The Markov algorithm would be in place of the Turing machine in the OR setup. You could train the Markov algorithm with dictionary and other data. You might be able to generate some of the training data with regular current AI. That’s different from the current idea, which doesn’t use QC.
4:57 p.m.
A good way to summarize this: You could build a very “immediately applicable and useful” *English language Markov algorithm*. Take any passage, with or without blanks, and fill in the blanks with the perfect passage or phrase or word.
What you need, more than anything, is the *oracle* for the replacement function problem. It could be written as a decision problem, where the certificate is the thing you replace in, i.e., you substitute X for Y, and X is the certificate.
5:03 p.m.
I think a key to this is, in some cases, you will replace a passage with a *passage type*, and, in some cases you will replace a passage with a *universal blank*.
The final version of any sentence has to be without those things.
If you’re computing, you can restrict attention to only “(blanks/passage-type)-free” rules. It’s sort of like mathematical logic, where you can generate an open wf with modus ponens or universal generalization.
Think of it like this: Try to non-deterministically guess the perfect Markov algorithm of this nature. It *does* exist for the English language. It’s just that sometimes, you have to be creative about choosing *which “passage type” a phrase or string containing multiple passage type or universal variables is*, and, choosing to substitute something with a blank. Note, sometimes, the blanks will be indexed, e.g., x1, x2, x3, x1 again…sometimes, one blank will be the same as another blank.
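A minimal sketch of indexed blanks, where the same index always receives the same filler. The `<x1>` notation and the helper names `fill_blanks` and `is_closed` are assumptions for the example, not notation fixed by these notes.

```python
import re
from typing import Dict

BLANK = re.compile(r"<x(\d+)>")  # hypothetical notation for an indexed universal blank


def fill_blanks(template: str, values: Dict[int, str]) -> str:
    """Substitute each indexed blank; blanks without a value are left in place."""
    def substitute(match: re.Match) -> str:
        index = int(match.group(1))
        return values.get(index, match.group(0))
    return BLANK.sub(substitute, template)


def is_closed(sentence: str) -> bool:
    """A sentence is 'closed' once no blanks remain."""
    return BLANK.search(sentence) is None


template = "<x1> likes <x2>, and <x2> likes <x1>"
partial = fill_blanks(template, {1: "Alice"})
print(partial)                         # Alice likes <x2>, and <x2> likes Alice
print(fill_blanks(partial, {2: "Bob"}))  # Alice likes Bob, and Bob likes Alice
```

The intermediate string is an open sentence; only the last one is a final sentence in the sense above.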
So we define these things in the overall “nondeterministically guessed” program-thing:
- replacement rules
- “concealment” rule
- “abbreviation” rule
- passage type definition
- indexed universal blank definition
And remember, we restrict some things based on parts of speech, including, e.g., adjectival phrases, not just words.
5:29 p.m.
Could genetic algorithms help make this work?
6:00 p.m.
I think the most important and “smart/creative” thing is, to try to devise the *passage types* very carefully.
You might even allow recursive passage types, e.g., PASS_TYPE_X = {"abc”, “def”, “q” <concat> PASS_TYPE_X <concat> “r”}
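To see what the recursive definition generates, here is a small sketch that unfolds PASS_TYPE_X to a bounded depth and also tests membership directly. The depth bound and both function names are my own additions for illustration.

```python
from typing import Set

LITERALS = {"abc", "def"}  # the two non-recursive members of PASS_TYPE_X


def expand_pass_type_x(depth: int) -> Set[str]:
    """Strings of PASS_TYPE_X reachable with at most `depth` recursive unfoldings."""
    strings = set(LITERALS)
    for _ in range(depth):
        # One more unfolding: wrap everything found so far in "q" ... "r".
        strings = LITERALS | {"q" + s + "r" for s in strings}
    return strings


def matches_pass_type_x(s: str) -> bool:
    """Membership test: strip matching q...r wrappers, then check the literal core."""
    while len(s) >= 2 and s.startswith("q") and s.endswith("r"):
        s = s[1:-1]
    return s in LITERALS


print(sorted(expand_pass_type_x(1)))   # ['abc', 'def', 'qabcr', 'qdefr']
print(matches_pass_type_x("qqabcrr"))  # True
```

The full type is infinite, so in practice a rule would probably test membership rather than enumerate.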
6:39 p.m.
Basically, we are building a massive Markov algorithm, which is based entirely on values that can be obtained from the “replacement oracle.”
It might be too hard to write down the full Markov algorithm…so the idea is, the replacement oracle stands in for the Markov algorithm, different passage types are defined carefully, and then a series of different “strong replacements” are sort of “proposed,” and then we *approximately* run the Markov algorithm, by applying the “strongest proposed replacement” rule.
You could call it a pseudo-Markov-algorithm, or, an approximation of a Markov algorithm.
6:44 p.m.
One cool thing is, we only have to build the English algorithm once.
Also, if we have quantum computers, we can surely devise a good set of passage-types that work given enough training data…the problem is surely in NP, and we can build out the rest of the “replacement/other rules” as part of it, and then just discard them and keep the passage-types and re-generate other replacement/other rules using the
04-07
10:15 a.m.
How do you decide what should be a “passage type”? I think the answer is, based on what allows you to create productive substitution/other rules.
Perhaps we could infer good rule ideas and therefore good passage type ideas from passages of training text…*especially* question and answer. We could learn a lot from parsing the text from StackOverflow and StackExchange websites…the posts are even well-rated by human beings.
Once we have *some* passage types and rules, is there a way to generate more? Yes: We might be able to use existing rules to answer *some questions*…questions that we generate ourselves…and then we could use those questions and the answers we generate to further train the AI.
You could train the AI to be especially a “fill in the blanks” and “answer the question” AI.
10:27 a.m.
One thing you could do is, use existing AI to create brief summaries of the text. You could devise 50 different summaries of the question, and assign a passage type based on that, i.e., the passage type is anything that is equivalent to the 50 summaries.
You could summarize the answer, too. Then you could build a substitution rule where the question written as a statement, with “The answer to ____ is, _____”, has two wildcard universal blank variables, and a rule exists where we substitute in one passage type for the first blank and the other passage type for the second blank.
The problem is, this doesn’t totally give us the whole arrangement for the “replacement/other rule oracle.”
I think genetic algorithms might be helpful. Either that, or regular AI. Perhaps: Set up a problem where it could be solved by GA if GA is good enough, and then use the traditional AI tools that are available now to solve the problem.
The idea is, address deceptive fitness functions by pivoting from genetic algorithms to something like ChatGPT or Google’s AI.
You can still add in the other rules, based on question-answer and summaries, and dictionary definitions.
Hypothesis: Traditional ChatGPT-type AI is dominant over genetic algorithms, and can solve all problems that GA can solve, but just as well or better.
10:33 a.m.
The way to use GA is, you start by setting up a “full replacement oracle” algorithm. You add in things that the algorithm does—you add rules, and you never remove a rule that you add in. The algorithm is sort of like a Markov algorithm, but it has other rules besides strict substitution rules, and, the rules are not done in order exactly (although they could be)…we just select a rule that is “very strong” to use. We can choose a “strong enough rule” that we think will help us make progress, we don’t have to go strictly in order from strongest rule to less strong rule…although, we could, in theory anyway…maybe it would lead to problems.
Anyway, the idea is, we devise candidate new rules. We have the oracle algorithm running. We also have a large set of training data…questions and answers, and passages that can be re-written with equivalent meaning. The new candidate rules are devised based on crossover (an algorithm that combines two rules in some way) of existing rules that are already in the model. The fitness function is how closely a particular given new rule fits the set of training data. Each time we select a new “dominantly good rule,” we simply *add that rule to our mix*. If no rule is good enough, we try again. We can choose a small subset of the training data at random, and choose some of our rules at random.
Don’t worry about how the crossover function would work—we are not actually going to use crossover, we are going to use ChatGPT-style AI instead of crossover. The fitness function is easy to define. This is the approach to GA I was talking about before. It’s not literally going to be GA, but this setup is how you bring ChatGPT-type AI to bear on the problem.
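A rough sketch of that loop, under heavy assumptions: `propose` is a hypothetical stand-in for the chatbot call that takes the place of crossover, the fitness function is a deliberately crude exact-match score, and the threshold and sample sizes are arbitrary.

```python
import random
from typing import Callable, List, Set, Tuple

Rule = Tuple[str, str]  # (target, replacement)
Pair = Tuple[str, str]  # (original passage, equivalent rewrite) from the training data


def fitness(rule: Rule, pairs: List[Pair]) -> float:
    """Fraction of training pairs that the rule reproduces exactly (a crude stand-in)."""
    target, replacement = rule
    if not pairs:
        return 0.0
    return sum(src.replace(target, replacement) == dst for src, dst in pairs) / len(pairs)


def grow_rules(rules: Set[Rule], pairs: List[Pair],
               propose: Callable[[List[Rule]], Rule],
               rounds: int = 10, sample_size: int = 4,
               threshold: float = 0.5, seed: int = 0) -> Set[Rule]:
    """Add good candidates to the rule set; rules are never removed once added."""
    rng = random.Random(seed)
    rules = set(rules)
    for _ in range(rounds):
        sample = rng.sample(pairs, min(sample_size, len(pairs)))  # random training subset
        parents = rng.sample(sorted(rules), min(2, len(rules)))   # random existing rules
        candidate = propose(parents)  # hypothetical: in practice, a query to the chatbot
        if fitness(candidate, sample) >= threshold:
            rules.add(candidate)
    return rules


pairs = [("a big dog", "a large dog"), ("big ideas", "large ideas")]
grown = grow_rules({("small", "tiny")}, pairs, propose=lambda parents: ("big", "large"))
print(sorted(grown))  # [('big', 'large'), ('small', 'tiny')]
```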
11:55 a.m.
The key is, we want *targeted crossover*. I.e., the new “replacement/etc. rule” we obtain needs to begin with a certain sort of thing, something so that the replacement is *relevant* to the sentence we are manipulating with rules. We don’t need/want just any rules, we want very specific rules that can be applied directly to the “open wf” passage we have.
So you have to ask the AI to do crossover to form a particular “template passage,” e.g., you might say, “We begin with a sentence that is of the form ‘PassageTypeX concatenate blank1 concatenate jumps over concatenate blank2 concatenate PassageTypeY’”, and then the key is, we are going to replace that with something. We can generate different “initial things to sub for,” based on the passage, and then use AI to generate a proper replacement for the starting thing in the sentence.
Every rule *exists*, the question is, what is its *rating*. The challenge is, you have to use the existing rules you have to develop a rating for new rules. You might start by generating more arbitrary rules, and then after you have tons of wild miscellaneous rules, with ratings, start using them to crossover to form new, more targeted rules.
Again, the hardest thing is designing passage types. A passage type is literally just a collection of strings—or a more succinct representation of a collection of strings—that are “legally replaceable” back and forth with the passage-type.
When you “cross over” with AI, you can “nondeterministically guess” the passage types that you’re going to use. The challenge is, when you have a rule, and passage types are used, you want to make sure that the passage type that is defined for that rule is “very good” for making that rule stronger. You might be able to use AI to test out each passage type, and you might even be able to do it yourself…generate a passage that contains text with the passage type, more or less at random, and then try to assess if performing the substitution/etc. preserves the meaning of the new passage pretty well. You can *easily* test “similar meanings” of sentences with existing chatGPT-like AI :).
12:14 p.m.
The comment is, using “meaning comparison” AI, it should be easy to use AI to evaluate the *rating* of a selected rule. The hard part is to identify the rule. *****Identify lots of rules—like 1000, and target ones that would be useful—and then after the evaluation, take, say, the top 10 rules*****. Then you have those and you can use them to improve the passage. The goal of the rule is to replace a passage type or variable with a new passage/phrase/word that makes the overall passage simpler, while preserving the meaning rather well. *****Try to decrease the number of blank-variable items in the sentence.***** *****You generate “random relevant passages,” and evaluate the rules on those passages.*****
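The "rate many candidates, keep the top few" step could be sketched like this. The `similarity` argument stands in for a meaning-comparison AI; the `word_overlap` function below is only a toy placeholder so the example runs, and all names are hypothetical.

```python
import heapq
from typing import Callable, List, Tuple

Rule = Tuple[str, str]  # (target, replacement)


def rate_rule(rule: Rule, passages: List[str],
              similarity: Callable[[str, str], float]) -> float:
    """Average similarity between each relevant passage and its rewritten form."""
    target, replacement = rule
    relevant = [p for p in passages if target in p]  # only test where the rule applies
    if not relevant:
        return 0.0
    total = sum(similarity(p, p.replace(target, replacement)) for p in relevant)
    return total / len(relevant)


def top_rules(candidates: List[Rule], passages: List[str],
              similarity: Callable[[str, str], float], k: int = 10) -> List[Rule]:
    """Keep the k best-rated candidates, e.g., the top 10 out of 1000."""
    return heapq.nlargest(k, candidates, key=lambda r: rate_rule(r, passages, similarity))


def word_overlap(a: str, b: str) -> float:
    """Toy placeholder for a meaning-comparison model: Jaccard overlap of word sets."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 1.0


candidates = [("a very big dog", "x"), ("very big", "huge")]
print(top_rules(candidates, ["a very big dog"], word_overlap, k=1))  # [('very big', 'huge')]
```

Word overlap is a poor proxy for meaning; it is here only so that the ranking logic has something to call.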
12:23 p.m.
Technically, although it’s *like* a Markov algorithm, it isn’t a Markov algorithm. *Every* rule exists in the “program,” but the strengths of the rules are different…weak rules that don’t preserve meaning have lower strengths.
Don’t forget to use the dictionaries to help build some key rules in the program. We don’t want to consider every rule, but we do want to consider every dictionary rule.
Bold claim/hypothesis: *****The entire English language can be represented using this type of replacement/other rules program.*****
:) I think you can ask the AI to devise the passage types for you. Remember, start with a large group of existing rules—including the dictionary rules—and ask the AI to devise good passage type definitions and devise good, targeted rules with high ratings. You generate the ratings yourself, with your own computer program…the AI will try to devise good ones, and you will check them.
This sounds great! Don’t be a wuss or lazy…if no one wants to buy this, go ahead and code it yourself and build your “verbose music service”. :)
*****Remember to train on StackExchange/StackOverflow questions and answers.*****
12:55 p.m.
A question: Could a human theoretically do this process on his/her own, without using AI or even any computers?
12:57 p.m.
Rather than having the passage types “globally stored” in the large “program” for the English language, you could store each “passage type” as part of a rule, e.g., one rule might have 10 passage type definitions associated with it. It’s just a collection of strings.
A problem: If the blanks are taken to mean “anything”, any word or phrase, then replacing the blank with a particular word or phrase or passage type will *not* preserve the meaning of the overall sentence/passage. Instead, it will change the meaning to something more specific.
So the goal is *not* to preserve MEANING of the sentences/passages. It is to preserve the TRUTH of the sentences!
1:05 p.m.
Is a string replacement rule like a *fact*? =\. If so, we might be in trouble. Are we fact-finding, or truly rule-finding?
I think the key is, if it’s a good rule, it can be tested on randomly generated passages. So I think it is fine, even if it is “a fact” in each case, sort of.
I think this *is* a good idea. I’ll have to test it out someday if I want to be sure.
Probably, no one will want to buy this from me. But I could start the music startup with it.
1:09 p.m.
I think the key to how this works is, there is *a lot of “navigation” of NON-CLOSED-WFs, i.e., sentences that are not real sentences, but instead intermediate sentences that are “true” but not real sentences…they are gibberish passage-type and blank-variable sentences, and they are used as a gateway to true real sentences.*
The notion of “truth” is *extended* to beyond regular English language sentences…there are “garbage” open-wf sentences that are “true” as well, and they can be used to make deductions about real sentences.
1:20 p.m.
Comment: I believe this idea could be shown to *definitely work* if we had a BQP SAT solver. Verify and prove that later!!
1:30 p.m.
To prove it, we’d need to show that even though it’s not exactly a Markov algorithm, the set of “huge programs” we could generate would be Turing-complete, i.e., every possible language’s truth, including the truth of the English language, can be captured with this system, given the right ratings for all the replacement/etc. rules.
Based on the initial sentence’s truth-strength and the rule-strength of the applied rule, a new sentence will be formed that will itself have a truth-strength.
It is OK if you can’t fully prove this. :).
1:34 p.m.
How do we rate a particular replacement/etc. rule’s strength??
I think: With trial and error. We randomly select a fitting “closed sentence” that is based on the initial open or closed sentence, and then we apply the rule, and then randomly select a fitting “closed sentence” based on the output open wf, and, we do this repeatedly and rate how true each sentence is, as far as we know based on ChatGPT’s guess. We are assuming that most sentences are “sort of easy to tell” if they are true or false. We try to rate the sentences’ truth as accurately as we can.
The problem is, that’s kind of recursive. If we already have a truth-evaluation program, why not just use that?
Maybe the argument is: We can *sometimes* evaluate if a sentence is true. If we don’t know, we omit, and try again with another random possibility. If the sentence is written ungrammatically, then of course it is false. That is why most rules are not very good. It might be hard to find good, productive rules.
1:43 p.m.
I think it is possible to prove that it works if we have QC :). I’m not sure exactly how though. We’re basically talking about “consistency.” We could argue that we start with a very “weak truth evaluator”, something based on ChatGPT perhaps, and then, we use this technique to build a stronger truth evaluator. Then, we use that to build an even stronger truth evaluator, etc., until we have a full truth evaluator and a full fill-in-the-blanks program.
2:00 p.m.
It should work provably with BQP…
- Is it possible for a massive (not fully written down) program for English to truly capture truth in the English language completely?
- If we used Occam’s razor for this, i.e., tons of written down true sentences, would it work? (Yes, I think.)
2:08 p.m.
Yes, #2 works if #1 works. For sure…it’s just Occam’s razor. You just have to prove #1! I would argue, even if we have no “open wfs” included in the training data, it will still work. It’s an optimization problem. We can’t write down all of the rules, but we could write down *a program* that outputs all the rules! I.e., we could output the “replacement/etc. rules ‘oracle’” and that would tell us everything we need. It shouldn’t be all that many Turing states. Just envision a human brain with a little bit of extra oracular power to evaluate sentences based on “extreme genius” and high-computing power.
Worst-case scenario, this could be an excellent publication for how to get AI to work after quantum computers come out.
Is it possible to use AI just once—just use AI to do your best on the optimization problem, and solve for the “replacement/etc. rules oracle algorithm,” and then just use the best version of that that you can find?
Note, it could be that OpenAI has already used the Occam’s razor program to enhance its capabilities. I.e., they started with a weak AI. Then they put in the training data, 2 million true and false sentences or whatever, and used their capability to engineer a better Turing machine than it itself is. Then, they repeated the process repeatedly. That could be true! It could also be that they have been misleading about how the thing works, and actually, it’s just a very good GA algorithm that solved the Occam’s razor problem “well enough” at first.
2:32 p.m.
Note, we need two different algorithms:
- The replacements/etc. rule strength rating calculator (given the rule, output the strength).
- Given an English language sentence (or an open wf version), identify the strongest and most productive rule for eliminating at least 1 universal-blank-var or passage-type.
If we have access to BQP, we can do this fine, since the second one is better than the first one.
I might try to just publish this to my blog someday. It depends how much money I have, and whether or not I’m going to be able to help Africa without just using money.
8:08 pm
You can use existing rules to use proof by contradiction logic to demonstrate a new rule’s weakness. Show that if you assume the new rule works, it, along with the other established rules, will generate absurd nonsense sentences.
Also, with Monte Carlo algorithms you should be able to do lots of trials and assess how often the new rule produces a good true sentence. That’s basically the definition of a “good rule”: it often generates good true sentences.
(Select *relevant* random passages to test on in the Monte Carlo algorithm.)
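A minimal sketch of the Monte Carlo estimate. The `is_good` callback is a hypothetical stand-in for whatever judges a sentence to be good and true (e.g., a chatbot), and "relevant" is taken to mean simply that the passage contains the rule's target, which is my simplification.

```python
import random
from typing import Callable, List, Tuple

Rule = Tuple[str, str]  # (target, replacement)


def monte_carlo_strength(rule: Rule, passages: List[str],
                         is_good: Callable[[str], bool],
                         trials: int = 1000, seed: int = 0) -> float:
    """Estimate how often applying the rule to a relevant passage yields a good sentence."""
    target, replacement = rule
    relevant = [p for p in passages if target in p]  # only sample passages the rule touches
    if not relevant:
        return 0.0
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        passage = rng.choice(relevant)
        if is_good(passage.replace(target, replacement, 1)):
            hits += 1
    return hits / trials


passages = ["the film was good", "good grief", "an unrelated line"]
# Toy judge: the rewritten idiom "excellent grief" counts as a bad sentence.
estimate = monte_carlo_strength(("good", "excellent"), passages,
                                is_good=lambda s: "grief" not in s)
print(estimate)  # roughly 0.5, since one of the two relevant passages rewrites badly
```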
8:17
So in fact, testing rules might be easy.
Other than random guessing, how can we generate new rules to test that we think may have a high rating?
Strategy: capture. Given a set of words in order, find a smaller phrase/word that captures the same thing.
Strategy: use dictionary to substitute a synonymous phrase
Strategy: build a “strong” passage-type definition and use it in your rule for your passage
Philosophical question: what’s a strong useful passage-type definition?
8:27 p.m.
So, this is great! We have the “rule strength oracle” algorithm already, and it’s easy!
04-08
10:29 a.m.
One issue: If you just preserve truth, and not meaning, the new version of the sentence might say something that is true, but totally irrelevant, based on string replacements, to what the original sentence meant.
There is an easy way to handle this, though; just add restrictions to the sentence, so that certain words—e.g., if it’s “The answer to <blank1> is <blank2>”, don’t do any replacements of words in <blank1>—are not able to be replaced.
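A small sketch of that restriction: apply a replacement everywhere except inside a protected span, such as the question text in <blank1>. The function name and the choice to protect only the first occurrence of the span are assumptions for the example.

```python
def replace_outside(text: str, protected: str, target: str, replacement: str) -> str:
    """Apply target -> replacement everywhere except inside the protected substring."""
    start = text.find(protected)
    if start == -1 or not protected:
        return text.replace(target, replacement)  # nothing to protect
    end = start + len(protected)
    # Rewrite the text before and after the protected span, but copy the span verbatim.
    return (text[:start].replace(target, replacement)
            + protected
            + text[end:].replace(target, replacement))


question = "what is the best song"
sentence = f"The answer to {question} is the best track ever"
print(replace_outside(sentence, question, "best", "finest"))
# The answer to what is the best song is the finest track ever
```

The question keeps its original wording, so the rewritten sentence still answers the same question.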
Problem: How do you evaluate the truth value of an “open wf” sentence with blanks?
10:36 a.m.
I think the comment is, this idea is good if and only if quantum computers can be used to make it work. It’s better than the basic Occam’s razor thing. Don’t use the “rule strength oracle” algorithm idea you had above. That is not good enough. You need to nondeterministically guess the whole thing.