Friday, January 21, 2011

ElephantMark

a simple PHP documentation tool with Markdown markup

Sure, (X)HTML is the standard format for documentation, and that is great. This holds even more for people writing programs. But writing or reading HTML code directly is a bit awkward, and a couple of lightweight markup languages have emerged to make this easier. In the past, I often used Perl's POD. But recently I discovered Markdown, which is even more convenient and versatile as a general tool.


I think Markdown is also very suitable as a format in program comments. For example, wrapping a piece of code ... into code tags <code>....</code> is achieved by simply placing backticks around it. Putting a block of code inside a <pre><code>....</code></pre>, with special characters (&, <, > etc.) converted into HTML entities, is done simply by indenting the lines with 4 spaces or 1 tab. All this is very intuitive and effortless.


With ElephantMark, I now have a tool that puts this idea into practice for PHP. ElephantMark is two things: first, it states three short and simple rules that turn ordinary PHP comments into Markdown text pieces. Secondly, it is a script elephantmark.php that actually performs this conversion and can be used like phpdoc.


ElephantMark is no competition for standard documentation tools. Professional programmers building big libraries will need smarter tools with features like automatic cross references etc. The idea of using Markdown in source code comments is probably more interesting for people who need to write in many different programming languages and use PHP only occasionally. ElephantMark is understood in five minutes; there is no whole new documentation language to learn.


In its current version 1, elephantmark.php makes use of the Markdown-to-HTML converter markdown.php, written by Michel Fortin. This is a great tool in itself, thank you very much! Currently, one needs both of these scripts for ElephantMark conversions. But maybe there is a way to merge them into a single file in future versions.


So, here is the script, which is its own manual:


elephantmark.php



Thursday, March 25, 2010

Change Logic

As part of the reconstruction of the computability and intelligence concept in terms of propositional logic, I am currently working on an effective formalization of (causal) processes. I realized that this goal is very similar to that of certain branches of modal logic, but that the approach is very different. Although the work itself is still in a premature state, I thought it would be interesting to work out these differences and explain some core ideas. This resulted in a five-page paper that starts with the following preface:


Suppose temporal logic is the subject that looks for a language and logic to reason about processes and things that change in time. Then it seems that this implies a thorough study of time itself. But this is wrong. Time is a philosophical burden and dead weight in temporal logic. We shouldn't try to associate events with a time structure; we only need to register change during the process. This paradigm shift is one starting point of a research project called change logic.


However, the elimination of time from temporal logic may not be as surprising as it tries to sound here. Actually, in the standard modal logical reconstruction, the relation to a time structure via a Kripke model is also only temporary. Once the formal system is motivated and its soundness and completeness are shown, time becomes superfluous here as well and disappears. In fact and in return, it is also possible to attach a linear time structure to what will be introduced as change logic.


So in the end, the real change with change logic comes not so much from a new philosophical semantics as from the fact that the whole thing is pulled off without adding new constructs to the syntax. In other words, change logic is temporal logic without modal operators.


The whole text is located here.

Thursday, February 4, 2010

"Literal Mathematics"

There seems to be a real revolution in mathematics on its way, namely its long-due formal standardization.

This process is maybe best elaborated in this hierarchy of three designs:
  • MathML, now in version 3.0 (December 2009), an XML application with its basic syntax in two modes: Presentation Markup and Content Markup. [http://www.w3.org/TR/MathML]
  • OpenMath, a companion standard for encoding the semantics of mathematical objects. [http://www.openmath.org]
  • OMDoc, an XML format for whole mathematical documents and theories, building on MathML and OpenMath. [http://omdoc.org]

Sure, in our times there is an inflation of revolutions. But I think this one is a really big one, despite the unexciting appearance of any norm.

  • A standard will provide us with a common language. A common language for mathematicians is more than a lingua franca or English as the world language. Formal scientists always need to create the world first, before they can talk about it. But at present, there is not even an agreement on whether the "natural numbers" start from 0 or 1. This is not freedom, but pure inefficiency.
  • Currently, each constructive idea requires a choice for a concrete programming language when it is implemented. "Higher" languages have an interface concept that abstracts the signature from the details of the implementation. But there are no real standards for the translation between different languages. Each language needs to re-implement all the libraries to be a useful one. Of course, there is XML now, and that is a big step (MathML, OpenMath, and OMDoc are XML, too). But for mathematical structures and theories, XML itself is all too general.
  • This standardization of mathematics is not a new foundation in the meta-mathematical sense. It is a syntactic agreement, not a semantical super-theory. It doesn't explain what a mathematical object really is; it only defines how we define one. I suppose this whole movement comes so late because it was anything but obvious from the tradition that a global standard does not need a global ontology. (And maybe in the end, that is a new foundation after all.)
  • Donald Knuth introduced literate programming as an emphasis on the idea that documents should be comprehensible for humans and computers alike. Many programming languages offer "literate programming" tools, but none of them lives up to that promise.
Unfortunately, at present there is still a shortage of tools to efficiently and comfortably work with these new formats.

Tuesday, February 2, 2010

PropLogic

In The propositional logic project, I describe my objections to the current state of this subject as follows:

Propositional logic has certainly become a standard notion of our scientifically educated society, maybe as much as the arithmetic systems of integers, rational and real numbers. Boolean operations are standard in many areas of our computerized daily life, and digital logic is the mathematical structure behind computer hardware and information software.


But unlike arithmetic systems and other basic data structures like lists, matrices or regular expressions, propositional algebras are not part of the standard tool repertoire of programming languages. This is certainly due to the cost explosion (see e.g. the boolean satisfiability problem) of the default implementation. What we need is a fast implementation that allows these structures to be used as basic tools by other programs.
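
To make that cost explosion concrete, here is a minimal Haskell sketch of the default approach, a brute-force truth-table test. The type Prop and all the names below are my own illustrative choices and have nothing to do with the PropLogic package itself:

    import Data.List (nub)

    -- A tiny propositional formula type over String atoms.
    data Prop = Atom String
              | Not Prop
              | And Prop Prop
              | Or  Prop Prop
              deriving Show

    -- All atoms occurring in a formula.
    atoms :: Prop -> [String]
    atoms = nub . go
      where
        go (Atom a)  = [a]
        go (Not p)   = go p
        go (And p q) = go p ++ go q
        go (Or  p q) = go p ++ go q

    -- Evaluate a formula under an assignment, given as (atom, value) pairs.
    eval :: [(String, Bool)] -> Prop -> Bool
    eval v (Atom a)  = maybe False id (lookup a v)
    eval v (Not p)   = not (eval v p)
    eval v (And p q) = eval v p && eval v q
    eval v (Or  p q) = eval v p || eval v q

    -- Brute-force SAT: try all assignments of the occurring atoms.
    satisfiable :: Prop -> Bool
    satisfiable p = any (\v -> eval v p) assignments
      where
        assignments = mapM (\a -> [(a, False), (a, True)]) (atoms p)

    -- Example: (A or not B) and (B or not A) is satisfiable.
    main :: IO ()
    main = print (satisfiable (And (Or (Atom "A") (Not (Atom "B")))
                                   (Or (Atom "B") (Not (Atom "A")))))

For a formula with n distinct atoms this runs through up to 2^n assignments, and that exponential blow-up is exactly what makes such a naive propositional toolkit useless as a building block for other programs.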


The other problem with propositional logic is its classic algebraization as a (free) boolean algebra, which is only an abstraction of the semantic structure of propositional logic. That way, we lose some of the information. In other words, we need an algebraization that also preserves the syntactic structure of propositional worlds.

I am happy to announce the release of PropLogic, a Haskell package that intends to fix these problems and that might serve as a general and useful tool.


Despite my original intent to write a compact implementation for a pretty compact theory, this distribution is overloaded in an attempt to explain all its aspects. I suppose, the best place to start is A little program for propositional logic and a Brief introduction to PropLogic.


The first of these two tutorials doesn't require prior Haskell knowledge. Any fast implementation of a propositional algebra also provides a fast SAT solver, and there is an interest in and competition for the quickest solution. I have no idea how my program performs compared to other existing algorithms out there, but I tried to illustrate with some data how well it does the job. (I must admit, however, that my "fast" program has its limits, too.)


The thing seems to work properly as it is, but I would still like to do some polishing and upload it to Hackage, soon. It would be very nice to get a boost from the comments and reactions of the Haskell community.

Friday, November 6, 2009

A fast SAT solver

A decade ago, I developed a system for propositional logic based on Prime Normal Forms. The main function takes an arbitrary propositional formula φ and returns its prime conjunctive normal form pcnf(φ). Implicitly, this algorithm solves the SAT problem, i.e. it provides a general decision method for the question whether a given formula φ is satisfiable or not, namely:
φ is satisfiable iff pcnf(φ) is not (the normal form of) 0, i.e. "false".


Obviously, the SAT problem is one of the hot issues in computer science and there is a demand for a fast algorithm. However, it has never been the focus of my own research project, which deals rather with a re-interpretation of modern logic. I just needed a system that provided me with the functionality of propositional logic, and at that time I didn't know of any available one. I supposed, however, that the solution I found would satisfy a general demand and that my approach touched some very deep insights into the matter. I published the mathematical theory in a paper. I also wrote a Java applet that works like an online pocket calculator for propositional logic and accompanied it with a couple of tutorials and introductions for all kinds of users.


Sketch of the method


In my publications I rather use the dual as the default, i.e. I consider Prime Disjunctive Normal Forms, the function is pdnf instead of pcnf, and the satisfiability problem becomes the validity problem. The algorithm for the pdnf function is not stochastic or heuristic in nature; it is a strictly deterministic and algebraic procedure. I'll try to sketch its basic features, but first let me recall some (more or less) standard terminology and well-known facts (a small Haskell sketch of these notions follows the list):


  • A literal λ is either an atomic or a negated atomic formula, i.e. α or ¬α.

  • A normal literal conjunction or NLC γ is a conjunction of literals [λ1 ∧ ... ∧ λk] such that the atoms α1, ..., αk occurring in these literals are strictly linearly ordered, according to some given linear order relation < on the chosen set of atoms. Each λi is a component of γ.

  • A disjunctive normal form or DNF Δ is a disjunction of NLC's [γ1∨...∨γn]. Each γi is a component of Δ. As is well known, each formula φ has an equivalent DNF Δ, written φ⇔Δ.

  • Given an NLC γ=[λ1 ∧ ... ∧ λk] and a DNF Δ=[γ1∨...∨γn], we say that

    • γ is a factor of Δ, if γ implies (or is subvalent to) Δ, written γ⇒Δ.

    • γ is a prime factor of Δ, if it is a factor and none of its components λ1, ..., λk could be deleted without violating the subvalence γ⇒Δ.



  • A DNF Δ=[γ1∨...∨γn] is called a

    • prime DNF or PDNF, if the set of its components {γ1, ..., γn} is exactly the set of all its prime factors.

    • minimal DNF or MDNF, if there is no other equivalent DNF which is smaller in size. (The size of a DNF is the number of components and atom occurrences.)



  • Every propositional formula φ has an equivalent PDNF. This PDNF is unique (up to the order of its components). So the function pdnf that returns the equivalent PDNF pdnf(φ) for every given φ is a well-defined canonization of propositional logic.

  • Every φ also has an equivalent MDNF. But this MDNF is not unique in general. It is, however, always a subset of the PDNF in the sense that each component of the MDNF must be a component of the PDNF.
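
Here is one possible rendering of these notions in Haskell types, as announced above. The representation of atoms as Strings, literals as pairs, and the helper nlc are only illustrative choices of this post, not the data structures of the actual implementation:

    import Data.List (nub, sort)

    type Atom    = String          -- atoms, linearly ordered by the standard compare
    type Literal = (Atom, Bool)    -- (α, True) stands for α, (α, False) for ¬α
    type NLC     = [Literal]       -- conjunction of literals with strictly increasing atoms
    type DNF     = [NLC]           -- disjunction of NLC's

    -- Turn a literal list into an NLC, provided its atoms are pairwise distinct.
    nlc :: [Literal] -> Maybe NLC
    nlc lits
      | as == nub as = Just (sort lits)   -- sorting puts the atoms in strictly increasing order
      | otherwise    = Nothing            -- a repeated atom violates the NLC condition
      where as = sort (map fst lits)

    -- Example: the DNF [[A ∧ B] ∨ [¬B ∧ C]] used further below.
    exampleDNF :: DNF
    exampleDNF = [ [("A", True), ("B", True)]
                 , [("B", False), ("C", True)] ]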


Our goal is an implementation of the pdnf function, i.e. the construction of an equivalent PDNF Δ for each given formula φ. The real core of this function is the P-Procedure, which takes an arbitrary DNF Δ and returns the equivalent PDNF P-Procedure(Δ). A classical method to implement the P-Procedure is the Quine-McCluskey method. But that algorithm grows exponentially and is not feasible for anything but small input DNF's. We need something else, and we start with the idea of pairwise component minimization, which we call the M-Procedure:


  1. We take two components γL and γR of the given Δ and replace them by the components of the MDNF [μ1∨...∨μm] of [γL∨γR]. Obviously, m is either 1 or 2, so this step can only decrease the size of Δ.

  2. We repeat the first step until no more changes can be applied.


The resulting DNF, denoted by M-Procedure(Δ), is what we call a pairwise minimal DNF or M2DNF, i.e. a DNF where each pair of components makes a minimal DNF. It is easy to prove that

  • each PDNF is a M2DNF, and

  • each MDNF is a M2DNF.


But neither of these two facts holds the other way round. M-Procedure(Δ) is neither the prime nor a minimal form of Δ, at least not in general. The M-Procedure is not a realization of the P-Procedure (hence the two different names). But it will serve us well in a proper implementation of the P-Procedure...


I suppose that most people who have spent some time and concentration on the SAT problem have tried this approach of an M-Procedure. It is not a trivial matter to understand why it has to fail. The notion of prime in propositional logic is probably motivated by the corresponding concept in number theory. But a closer investigation reveals a surprising and fundamental difference between prime factors of propositional formulas and prime factors of integers. Both the problem and its solution stem from the analysis of binary DNF's [γL∨γR].


For every two NLC's γL and γR we write


  • min(γL,γR) for the MDNF of [γL∨γR], and

  • prim(γL,γR) for the PDNF of [γL∨γR].


These functions min and prim have straightforward implementations (of linear complexity) and they are not hard to explain. The actually interesting and crucial point here is the fact that

  • min(γL,γR) is made of either one or two components, as mentioned earlier,

  • prim(γL,γR) is often the same as min(γL,γR), but there is also a situation where min(γL,γR)=[γL∨γR] and prim(γL,γR)=[γL∨γR∨γc] is a 3-component DNF. For example, consider

    prim([A∧B], [¬B∧C]) = [[A∧B] ∨ [¬B∧C] ∨ [A∧C]]

    This third and new γc is what we call the c-prime.
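
(In the terminology of switching theory, such a c-prime is what is known elsewhere as the consensus of the two NLC's.) As an illustration, here is a small Haskell sketch of it, reusing the type synonyms and the nlc helper from the sketch above; the function name cPrime is again only an illustrative choice of this post:

    -- The c-prime of two NLC's: if they clash on exactly one atom, the union of
    -- the remaining literals is a new factor of [γL ∨ γR]; otherwise there is none.
    cPrime :: NLC -> NLC -> Maybe NLC
    cPrime gl gr =
      case [ a | (a, b) <- gl, (a', b') <- gr, a == a', b /= b' ] of
        [clash] -> nlc (nub [ l | l <- gl ++ gr, fst l /= clash ])
        _       -> Nothing

    -- The example from the text:
    --   cPrime [("A",True),("B",True)] [("B",False),("C",True)]
    --     == Just [("A",True),("C",True)]      -- i.e. the component [A ∧ C]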


Now we are able to implement the P-Procedure:

Algorithm P-Procedure(Δ)
begin
  Δ' := M-Procedure(Δ) ;
  repeat
    (1) Δ'' := Δ' ;
    (2) let Π be the set of all c-primes of component pairs in Δ' ;
    (3) attach all the components of Π to Δ' ;
    (4) Δ' := M-Procedure(Δ') ;
  until Δ' and Δ'' contain the same set of components ;
  return Δ' ;
end.
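
For readers who prefer running code, here is a compact Haskell sketch in the spirit of this procedure, reusing the type synonyms and the cPrime function from the sketches above. Note that it collapses the M-Procedure into plain absorption, so it is really the classical iterated-consensus construction of all prime factors rather than a literal transcription of the pseudocode; all names are again only illustrative:

    -- γ absorbs γ' (i.e. γ' ⇒ γ) iff every literal of γ also occurs in γ'.
    absorbs :: NLC -> NLC -> Bool
    absorbs g g' = all (`elem` g') g

    -- Drop every component that is absorbed by another component.
    absorb :: DNF -> DNF
    absorb d = [ g | g <- d', not (any (\h -> h /= g && h `absorbs` g) d') ]
      where d' = nub d

    -- Iterated consensus: add all c-primes of component pairs, drop absorbed
    -- components, and repeat until the set of components is stable.
    -- For proper NLC components as input, the fixpoint is the PDNF.
    pProcedure :: DNF -> DNF
    pProcedure delta = go (absorb (map sort delta))
      where
        go d | sort d' == sort d = d'
             | otherwise         = go d'
          where
            cs = [ c | g <- d, h <- d, g < h, Just c <- [cPrime g h] ]
            d' = absorb (d ++ cs)

    -- pProcedure exampleDNF yields the three components [A ∧ B], [¬B ∧ C], [A ∧ C]
    -- (in some order), matching the prim example above.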

The correctness proof for this P-Procedure is based on a deep result, which I called the Completeness Theorem, saying that a DNF is a PDNF iff it is a c-complete M2DNF.


Concerning its computational complexity: if n is the number of different atoms in Δ, then the P-Procedure needs no more than n repeat loops. This, together with the fact that the M-Procedure is of polynomial complexity, led me to suspect that the P-Procedure is of polynomial complexity as well. And that, of course, would have been a surprising answer to the open P=NP problem. When I realized that, I spent some time looking for evidence for or against my conjecture, but I was only able to deliver some lemmata and partial proofs, no definite decision.


Links


All the material mentioned above, in particular the paper and the Java applet, is available on www.bucephalus.org.


Monday, October 12, 2009

My new communist card game

The homepage has undergone a complete makeover. Design is not really a goal in the first place; information is more important than aesthetics. However, out of dissatisfaction with widespread features of mainstream design patterns, this latest version has some rather unconventional features:


  • There is one index page (the start/welcome page) and many single pages.

  • Each single page concentrates on its subject. It only has a link to the index page by default, instead of carrying a whole menu and framework around.

  • The index page is

    • comprehensive: it shows all tables of contents in full, so there is no need to browse to further pages,

    • compact: achieved by putting long content tables into scrollable cards,

    • communistic: every item is an equal card in the whole game.


  • The latest fashion of many blogging frameworks (including the present one) of fixing the width of the page content destroys the advantage of HTML over print formats (like PDF etc.), namely that it efficiently nestles into the browser window, be it a tiny smartphone or a huge cinema display. In particular, the cards that make up the index page are supposed to distribute nicely inside the window.


Hopefully it works and you like it.

Thursday, October 1, 2009

Half a tutorial on the Haskell number system

Dear nice Haskell people out there!

Thank you for your numerous friendly reactions to my number system picture in different web locations. It seems that many people feel the same pain when it comes to numbers in Haskell. Even such brilliant introductions as Real World Haskell seem to capitulate before this idiosyncratic complexity and rather just sum up the facts. As I said, I gave up on it as well. But your reactions keep itching.

So please allow me to show you at least the existing half of my tutorial. The missing part is the actual reconstruction of the type classes. At some point, I tried to combine that part of Haskell with a reconstruction of the mathematical evolution from natural to integer ... to complex numbers. Here is a glimpse of what I had in mind. I thought that many programmers could use this kind of update, which is necessary knowledge if one really wants to understand the logic behind the type class zoo.

Originally, this tutorial started off as just a section of an introduction to Haskell itself, some kind of "Haskell for mathematicians", with the ambition of being "the first truly functional introduction to this functional language". What I missed in all the classic texts is a purely conceptual or semantic approach to the matter. For example, they explain "if..then..else.." as a language construct that needs proper alignment etc. But "in fact" (i.e. in a functional brain), this is a function of type "(Bool,a,a)->a" (that just happens to have a non-default syntax). In other words, instead of forcing people to learn the language first, before they can decide if they want to think that way, I thought I could start with the philosophy right away before going into the formal details. I cut out the part of the original introduction that attempts to sketch the Haskell universe the way I try to approach it.
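
Just to make that last point concrete, here is a minimal Haskell sketch of this functional reading; the name ifte and the uncurried tuple signature are only illustrative:

    -- "if..then..else.." read as an ordinary function of type (Bool, a, a) -> a.
    ifte :: (Bool, a, a) -> a
    ifte (True,  thenValue, _) = thenValue
    ifte (False, _, elseValue) = elseValue

    main :: IO ()
    main = do
      print (ifte (3 > 2, "yes", "no"))        -- "yes"
      print (if 3 > 2 then "yes" else "no")    -- the built-in syntax, same result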