Learning Experience from BOL -> FOL in Coq

It’s possible in Coq!

Disclaimer: Dieser Thread wurde aus dem alten Forum importiert. Daher werden eventuell nicht alle Formatierungen richtig angezeigt. Der ursprüngliche Thread beginnt im zweiten Post dieses Threads.

Learning Experience from BOL → FOL in Coq
TL;DR: Defining formal languages with binders is very tricky in Coq. This is not an afternoon’s project.
EDIT: Fully formalizing BOL → FOL is indeed “easy” (with one idea) and possible, see my next post in this thread.

I managed to code something up in Coq for our Basic Ontology Language → FOL translation.

Usage: [code]Definition bolOntology := bolSignature * bolTheory. (* product type *)

(* … *)

Definition sampleOntology : bolOntology := (
[concept “lecturer”; individual “FR”;
concept “course”; individual “WuV”;
relation “teaches”; relation “helps with”],

[in_relation “FR” (relationUnion “teaches” “helps with”) “WuV”]

Compute (bolSemantics sampleOntology).[/code]

Output: = ([predicateSymbol "lecturer" 1; individualSymbol "FR"; predicateSymbol "course" 1; individualSymbol "WuV"; predicateSymbol "teaches" 2; predicateSymbol "helps with" 2], [folLamApp (folLamApp (folFormulaLam (fun x : FOLVariable => folFormulaLam (fun x0 : FOLVariable => folDisjunction (folLamApp (folLamApp (folFormulaSymbolRef "teaches") (fromVar x)) (fromVar x0)) (folLamApp (folLamApp (folFormulaSymbolRef "helps with") (fromVar x)) (fromVar x0))))) (symbolRef "FR")) (symbolRef "WuV")]) : FOLSystem

Problem: This is not a raw FOL formula since it contains Coq functions. But getting rid of them by implementing some computational semantics is trickier than thought.

More details

  • Formalizing HOAS (or even simpler: FOL with lambdas) is tricky since you cannot just define the following:

[code]Inductive Formula : Set :=
| folForall: …
| folImpl: …
| folLambda: (Term → Formula) → Formula

with Term : Set := …[/code] (allowing inductive constructors of the form of [m]folLambda[/m] makes CIC inconsistent)

Note that you don’t have this problem in LF!

  • You can circumvent this by making it [m]folLambda: (string → Formula) → Formula[/m], where we think of the string as a variable name.
  • But then for normalization you need to do [m]match … with | folApp (folLambda f) arg => substistute (f “z”) “z” arg[/m] except that “z” must always be a fresh variable name. At this point I gave up.
  • Nonetheless people have proved completeness of FOL in Coq, so there must be a way. Technically via De Brujin indices. Practically, you have to use libraries to make things easier. But the learning curve to use those libraries is high – I have been told.

Indeed, while Coq is a good language for this in principle, there are some subtleties.

The representation of names is generally a tricky problem.

  1. Defining theories (which introduce named constants, i.e., global, not alpha-renable names)

You’re presenting a deep encoding of names, i.e., BOL names are encoded as Coq data, namely strings.
You might alternatively try a shallow encoding of names, i.e., BOL names are encoded as Coq names (via modules or records).

General remark: shallow = represented as corresponding concept (names as names, types as types, formulas as formulas etc.); deep = represented as lower-level data (names as string, types as terms (of an appropriately defined inductive type), formulas as booleans)

A very nice property of an encoding is if well-formedness of the syntax can be encoded shallowly, i.e., to check if a BOL expression is well-formed, we can simply check if its encoding is well-formed as a Coq term.

That is often not possible, especially when names are introduced.
In that case, the representation of the syntax must include a family of functions (one function N_wellformed per non-terminal N) that take expressions (and possibly also a context) and return Booleans.
It return true for an iff the
The goal is then an encoded term t derived from N is well-formed if N_wellformed(t) returns true.
That will be necessary when representing BOL names as Coq strings.

  1. Representing binder (which introduce named variables, i.e., local, alpha-renamble names)

Coq, contrary to LF, does not support HOAS (a concept that we will probably not touch during this course but that is handled in KRMT).

But Coq, contrary to programming languages, is expressive enough to represent FOL-like formulas in its base language. (Programming languages fail at this because they cannot represent quantification.)
Therefore, when representing the semantics of BOL in Coq, we can by-pass FOL entirely: we just use Coq as the semantic language, in which we interpret BOL.
That eliminates the need to define FOL first.

Thanks for your comments!

If I understand you correctly, that would mean writing, for instance, the following in Coq:

[code]Axiom lecturer : Concept.
Axiom course : Concept.
Axiom teaches : Relation 2. (* binary relation *)

Axiom FR : lecturer.
Axiom WuV : course.
Axiom axiom1 : teaches FR WuV[/code]

While possible, it seems that we blur a bit the distinction between syntax and semantics. There is no explicit BOL-only syntax anymore. It’s mingled right away with semantics.

Indeed! But upon writing Sven for help, they quickly realized that we don’t even need full lambda support here. In fact [m]lambda: (folTerm → folFormula) → folFormula[/m] suffices for our purposes here, where the lambda’s argument just ranges over terms. (I had this version in my first post, too, but due to the useless mutual definition of the inductive types with [m]with[/m], Coq rejected it.)

I now fully formalized BOL → FOL in Coq such that it produces FOL signatures & theories as output:
(By contrast, in my first post the output was not pure FOL, but intermingled with Coq functions.)

[code]Require Import bol.
Require Import fol.
Require Import boltofol.

Definition sampleOntology : bolOntology :=
CON “professor” ;;
CON “lecturer” ;;
CON “course” ;;

REL "teaches" ;;
REL "taught-at" ;;
REL "responsible for" ;;
PROP "ects" TYPE bolInt ;;
PROP "hard" TYPE bolBool ;;

IND "FR" ;;
IND "WuV" ;;
IND "FAU" ;;

“WuV” HAS 5 OF “ects” ;;
“WuV” IS-NOT “hard” ;;

"WuV" <."taught-at".> "FAU" ;;

"FR" IS-A "lecturer" ;;
"FR" <."teaches" ; "taught-at".> "FAU" ;;  (* FR teaches at least one course at FAU *)
"lecturer" <<= ("course" forall "teaches") ;; (* lecturers only teach courses, nothing else *)


Compute (normalizeSystemFull (bolSemantics sampleOntology)).[/code]

I defined a lot of notations to make it look that pretty :slight_smile:
Florian, if you don’t mind, I’ll publish the source next week after the due date.

Is anyone up for collaborative creation of a BOL derivation calculus and proving correctness of BOL → FOL? :slight_smile: