Automating a Clojure Project's File Layout

Saturday, August 03, 2013

I have come to accept the file's place in application development. In Leiningen-based Clojure projects, a clean layout uses a file per namespace and a directory per parent namespace. But, as a developer, I write functions, not files. The files are just artifacts. The creation and organization of code files is so systematic that it could be automated by something that maps functions and declarations into the hierarchical file structure used by project tooling.

Proof of concept: https://gist.github.com/Jared314/6144582

The idea behind this “object-hierarchical mapper” is to abstract out the project file layout from the functions and namespaces. There is too much tooling around files to replace them as a code storage format, but there is very little reason to think about file organization.

This code works by passing the function declarations (thing1, thing2, and thing3), with their corresponding namespace declarations (tester1.core, tester1.server), to the parse function, which builds a tree whose nodes correspond to the namespace hierarchy. That tree (nested hash-maps) is then written out to the file system, using the write! function, into a valid project structure. The current issues center on optimization and messy code. In the generated files, the namespace declarations are not combined efficiently, and the functions are preceded by a list of declares to prevent reference-ordering issues.
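The gist itself is not inlined here, so as a rough illustration only: a minimal Python sketch of the same parse/write! split. The names `parse`, `ns_to_path`, and `render` are my own inventions, not the gist's, and rendering is kept pure (returning a path-to-contents map) rather than writing to disk.

```python
# Hypothetical sketch of the parse/write! idea: group function declarations by
# namespace, then map each namespace to its Leiningen-style source path
# (dots become directory separators, dashes become underscores).
from collections import defaultdict

def parse(decls):
    """decls: iterable of (namespace, source-string) pairs -> {ns: [sources]}."""
    tree = defaultdict(list)
    for ns, src in decls:
        tree[ns].append(src)
    return dict(tree)

def ns_to_path(ns, root="src"):
    """tester1.core -> src/tester1/core.clj (Leiningen's munging rules)."""
    return root + "/" + ns.replace("-", "_").replace(".", "/") + ".clj"

def render(tree):
    """Return {path: file-contents}; a write! step would just dump this map."""
    return {ns_to_path(ns): "(ns %s)\n\n%s\n" % (ns, "\n\n".join(srcs))
            for ns, srcs in tree.items()}

files = render(parse([("tester1.core", "(defn thing1 [] 1)"),
                      ("tester1.core", "(defn thing2 [] 2)"),
                      ("tester1.server", "(defn thing3 [] 3)")]))
```

The real version would also emit the list of declares and merge namespace declarations, which is exactly where the messiness noted above creeps in.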

This is not a new concept. I would guess the idea extends as far back as the 1960s, because that appears to be the case with most ideas in CS. I just remember being amazed by it the first time I watched the Code Bubbles demo. More recently, Light Table also had a similar demo, where code was edited in a single visual workspace. I believe they plan to bring that feature back in a future version. No matter where the idea came from, it seems like a useful layer of indirection.

This might end up working best as a lein plugin, with editors saving their changes “through” it, because the project file layout is tied to lein and other lein plugins. It might also be an interesting experiment to use a git repository as an additional backend, but that would have to happen after a refactoring of the write! function to something more functionally pure.

Learning core.async: A High School Tale

Wednesday, July 24, 2013

The examples use the 2013-07-24 snapshot of core.async. At the time of writing, the API is still in flux and very experimental.

[org.clojure/core.async "0.1.0-SNAPSHOT"]
from:
["sonatype-oss-public" "https://oss.sonatype.org/content/repositories/snapshots/"]

Once upon a time, Alice wanted to invite Bob to an awesome party she had heard about, but she didn’t know where the party was going to be yet. So, she told Bob to wait by his phone, and she would send him the location when she found out.
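The original Clojure snippet was an embedded gist that isn't shown here. As a rough Python analogue (my own sketch, not the original code), a core.async channel behaves like a blocking queue, and a go block like a lightweight thread that parks on a take until a value arrives:

```python
import queue
import threading

phone = queue.Queue(maxsize=1)   # (chan 1) -- Bob's phone
arrivals = []                    # where Bob ends up

def bob():
    # like (go (let [loc (<! phone)] ...)): Bob waits by his phone
    arrivals.append(phone.get())

t = threading.Thread(target=bob)
t.start()
phone.put("the warehouse")       # Alice texts the location when she finds out
t.join()
# arrivals is now ["the warehouse"]
```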



Unfortunately for them, the police were monitoring all the text messages. So, the police showed up, about an hour after the party had started, and arrested everyone for underage drinking.


Broadcasting


Alice, now feeling bad about the last party, decided to make it up to Bob and Carol by inviting them both to the next party she heard about. But, this time, she wanted to make sure no one else could see the message. So, she made a Facebook group, and only included Bob and Carol.
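The broadcasting gist is also missing; here is the fan-out idea in the same blocking-queue analogue. Only the name `broadcast` echoes core.async's experimental helpers; the rest is invented for illustration:

```python
import queue
import threading

def broadcast(message, channels):
    # put a copy of the message on every member's channel (fan-out)
    for ch in channels:
        ch.put(message)

received = {}

def member(name, ch):
    received[name] = ch.get()    # each friend waits on their own channel

group = {"Bob": queue.Queue(), "Carol": queue.Queue()}
threads = [threading.Thread(target=member, args=(n, ch))
           for n, ch in group.items()]
for t in threads:
    t.start()
broadcast("the docks, 10pm", group.values())   # Alice posts to the group
for t in threads:
    t.join()
# received == {"Bob": "the docks, 10pm", "Carol": "the docks, 10pm"}
```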



Unfortunately for them, the police were monitoring all the Facebook groups and messages. So, the police showed up, about an hour after the party had started, and arrested everyone for underage drinking.


Chaining Go Blocks


Alice decided to try one more time. This time also including Dave, to make up for the last failed party. But, Alice was getting slightly paranoid, so she decided to only tell Bob the location of the party, then have Bob tell Carol, and then have Carol pass it on to Dave.
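The chained go blocks (another gist not shown) pass the message hand to hand. A queue-based sketch, with hypothetical names:

```python
import queue
import threading

def relay(inbox, outbox):
    # a go block that takes from one channel and puts on the next
    outbox.put(inbox.get())

to_bob, to_carol, to_dave = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=relay, args=(to_bob, to_carol)).start()   # Bob tells Carol
threading.Thread(target=relay, args=(to_carol, to_dave)).start()  # Carol writes the note for Dave
to_bob.put("the old mill")   # Alice only tells Bob
# to_dave eventually receives "the old mill"
```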



Unfortunately for them, Dave got pulled over by the police, for a broken tail light, on his way to the party. Being within one hundred miles of the border, the police searched his car and found the handwritten note from Carol. So, the police showed up, about an hour after the party had started, and arrested everyone for underage drinking.


Timeouts


With this being the third time Alice, Bob, Carol, Dave, and Eve had appeared in his court, Judge Frank was ready to throw them in jail. In each case he decided to sentence them to several months in jail, based on their honesty (represented by a random number), with the option of letting them out early on good behavior (represented by an extra channel for each person).
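The sentencing snippet is the last missing gist. In core.async terms, each inmate waits on an alts! over their release channel and a timeout channel; a blocking-queue sketch of that race (the names are invented):

```python
import queue
import random

def serve(release_ch, sentence_s):
    # like (alts! [release-ch (timeout sentence-ms)]): whichever "channel"
    # delivers first decides how the sentence ends
    try:
        return release_ch.get(timeout=sentence_s)   # early release came through
    except queue.Empty:
        return "served full sentence"               # the timeout channel won

# honesty represented by a random number, as in the story
sentence = random.uniform(0.01, 0.05)
parole = queue.Queue()
parole.put("good behavior")
print(serve(parole, sentence))          # prints "good behavior"
print(serve(queue.Queue(), sentence))   # prints "served full sentence"
```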




Takeaways:

  1. Go blocks will block and execute in their own thread pool or, for the cljs version, their own state-machine-context-like-thing.
  2. The broadcast and multiplex (a.k.a. fan-out and fan-in) currently exist only as experiments. I hope they get rolled into the library, because I don’t want to have to write them myself.
  3. There is no “wait for all channels to have a value” function. alts! and the multiplexer return on the first channel to have a value.
  4. It looks like you could swap the underlying channel implementation for something like ZeroMQ (zmq-async) or RabbitMQ, but I’m not sure it would actually be a useful abstraction.
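On takeaway 3: a blocking “wait for all” is straightforward to sketch with queues standing in for channels. This `take_all` is a hypothetical helper, not part of core.async:

```python
import queue

def take_all(channels):
    # block until every channel has delivered one value, preserving order;
    # unlike alts!, this does not return on the first value
    return [ch.get() for ch in channels]

a, b = queue.Queue(), queue.Queue()
a.put(1)
b.put(2)
# take_all([a, b]) returns [1, 2]
```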

Tree storage in Git with JGit and Clojure

Tuesday, May 28, 2013

Summary: You can store parse trees, or ASTs, directly in a Git repo, but then you have new problems.

Git stores a file structure as trees, using blob objects for files and tree objects for directories. The tools usually ignore empty directories, but the internal object store has no problem handling them. So, every so often, I wonder why someone doesn’t store the full AST, or just the parse tree, in the repository, instead of the files themselves.
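To make that concrete, Git's object format itself is tiny, and nothing in it cares whether a tree mirrors directories or an AST. A stdlib-only Python sketch of the hashing rules (a simplification: Git's real entry sort treats subtree names as having a trailing slash):

```python
# A blob is "blob <len>\0<bytes>" hashed with SHA-1; a tree is a list of
# (mode, name, raw-sha) entries hashed the same way under a "tree" header.
import hashlib

def git_hash(kind, body):
    data = kind.encode() + b" " + str(len(body)).encode() + b"\0" + body
    return hashlib.sha1(data).hexdigest()

def hash_tree(node):
    """node: {name: subtree-dict or bytes} -> sha of the resulting tree object."""
    entries = b""
    for name in sorted(node):                       # entries are stored sorted
        child = node[name]
        if isinstance(child, dict):
            mode, sha = b"40000", hash_tree(child)          # TREE node
        else:
            mode, sha = b"100644", git_hash("blob", child)  # REGULAR_FILE leaf
        entries += mode + b" " + name.encode() + b"\0" + bytes.fromhex(sha)
    return git_hash("tree", entries)
```

An AST node with children hashes as a tree, a leaf as a blob, which is the same mapping the custom WorkingTreeIterator below performs through JGit.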

Proof of Concept

https://gist.github.com/Jared314/5655934

To summarize the code, I’m using a grammar, from the Instaparse tests, that groups 'a's and 'b's from the string “aaaaabbbaaaabb”. After parsing the text, I apply an Enlive transformation that removes the A nodes from the parse tree. I then render the result back into a tree, made of lists and hash-maps, to be stored in the Git repo using JGit and a custom JGit WorkingTreeIterator. Inside the custom tree iterator, I lazily convert each node into a JGit Entry with a FileMode of TREE or REGULAR_FILE, depending on whether the node has children.
Text > Instaparse > Enlive transform > Render tree > Custom TreeIterator > AddCommand

The Enlive transform and the render function are not required, but, in this case, I wanted to manipulate the tree before saving it.


The Good

  • Style Independent: Statements that parse, or even compile, equivalently produce the same stored representation. The arguments over tabs versus spaces, variable declaration alignment, and other slightly pedantic coding-style debates become meaningless, because only the parser or compiler’s “view” of the code is stored.

The Bad

  • Hard Version Dependency: The stored structure is directly dependent on a compiler version and any bugs or limitations of that version. Assuming the stored tree is tagged with language version information, the language developers would need to release a migration script, a translator, or keep previous versions available for long periods of time, as opposed to just relying on customers to change their own code.
  • Code Generator Required: Assuming only the tree is stored, checking out code from a repository would require a human-readable code generator, like what most decompilers do today. While this would allow for personalized code styling, it is an additional language specific tool to build and maintain.

  • Diff and Merge Tool Limitations: Most mainstream diff and merge tools focus solely on the contents of the stored blobs and assume node order independence. Whether this requires different algorithms or a different workflow, the tools will have to change to handle large trees and forests.



TinyCore x64 4.x

Saturday, October 08, 2011

TinyCore Linux releases pre-built ISO images for x86 platforms, but leaves remastering them for the x64 platform up to the user. After banging around for a few hours, I built my own script to handle the remastering task. If you need a more complete tool, the wiki has a write-up on remastering with ezremaster, a GUI tool run from within a TinyCore instance.
build-microcore64.sh
  • 2011-10-09 - Edit: Added remastering with ezremaster
  • 2011-10-08 - Initial Post

Useless Realizations

Saturday, April 18, 2009

I was writing a compiler, for my own "fun", and realized that C# v3.5 can be very succinct. I had built out my lexer and was roughing out a target Lisp-like syntax when I realized it was just like the C# I was writing.

private static Token GetToken(char p, IEnumerable<Token> validTokens)
{
    return validTokens.FirstOrDefault(item => item.Value.Contains(p));
}

(define GetToken [p validTokens]
(first validTokens (define [item] (contains (value item) p)))
)

I have not decided if this is proof that I have lost my imagination or proof of C#'s progress as a language.

IOC Containers - Part 3

Saturday, November 15, 2008

A little thinking after the last post led me to the idea of persisting objects beyond the scope of a running program: a super singleton. Because an IOC container abstracts the source, and in some ways the lifespan, of an object, there is nothing stopping me from persisting the object and using it between application instances or other systems. The code would never have to know.

I used db4o in my code because of its simplicity and native object storage. If you have never used db4o, I suggest trying it.

Also, this provider required explicit disposal, so I went back and implemented IDisposable on the Repository and the SingleServiceProvider.

Usage:

var r = new Repository();
r.RegisterServiceProvider(new SingleServiceProvider());
r.Register<IObjectContainer, SingleServiceProvider>(() => Db4oFactory.OpenFile("test.db"));
r.RegisterServiceProvider(new DB4OSingleServiceProvider(r.GetInstance<IObjectContainer>()));
r.Register<IMyClass, DB4OSingleServiceProvider>(() => new MyClass("a"));
IMyClass m = r.GetInstance<IMyClass>();
r.Dispose();

Code:

public class DB4OSingleServiceProvider : IRepositoryServiceProvider, IDisposable
{
    private readonly IObjectContainer container;

    public DB4OSingleServiceProvider(IObjectContainer container)
    {
        this.container = container;
    }

    ~DB4OSingleServiceProvider() { this.Dispose(false); }

    public void Remove(Type targetType)
    {
        var resultSet = this.container.Query(targetType);
        while (resultSet.HasNext())
        {
            this.container.Delete(resultSet.Next());
        }
        this.container.Commit();
    }

    public void Update(object o)
    {
        this.container.Store(o);
    }

    #region IRepositoryServiceProvider Members

    public void RegisterService(Type targetType, FactoryDelegate factoryMethod)
    {
        // only seed the store if no persisted instance exists yet
        var resultSet = this.container.Query(targetType);
        if (resultSet.Count < 1)
        {
            this.container.Store(factoryMethod());
        }
    }

    #endregion

    #region IServiceProvider Members

    public object GetService(Type serviceType)
    {
        return this.container.Query(serviceType).Next();
    }

    #endregion

    [ IDisposable Members … ]
}

I have not resolved how to handle events when the object has not been re-hydrated, or how to deal with multiple applications touching the same object database. Perhaps I will think about that some other time.

The biggest thing I have learned from this is that building something myself helps me understand it. Now I am going to try Autofac.

IOC Containers - Part 2

My initial IOC container was an automatic factory. When GetInstance was called, it executed the stored factory method and returned the result. You could technically put anything you wanted into the factory method, including a singleton instance or a database call, but it is no better than a standard factory method.

While looking at some of the IOC containers listed in my last post, I noticed the user could select the lifespan of each type. So, I went back to the workbench.



I have abstracted out the lifespan from the repository, through the IRepositoryServiceProvider interface. This change also opens up the possibility for creating custom lifespans.

Usage:

var r = new Repository();
r.RegisterServiceProvider(new FactoryServiceProvider());
r.Register<IMyClass, FactoryServiceProvider>(() => new MyClass("a"));
IMyClass m = r.GetInstance<IMyClass>();

Code:

public delegate object FactoryDelegate();
public delegate T FactoryDelegate<T>();

public interface IRepositoryServiceProvider : IServiceProvider
{
    void RegisterService(Type targetType, FactoryDelegate factoryMethod);
}

public class RepositoryPlusOne
{
    private readonly Dictionary<Type, IRepositoryServiceProvider> providers =
        new Dictionary<Type, IRepositoryServiceProvider>();
    private readonly Dictionary<Type, IRepositoryServiceProvider> types =
        new Dictionary<Type, IRepositoryServiceProvider>();

    public void Register<T, U>(FactoryDelegate<T> factoryMethod)
        where T : class
        where U : IRepositoryServiceProvider
    {
        Type u = typeof(U);
        Type t = typeof(T);

        if (!providers.ContainsKey(u)) throw new ArgumentException("Unregistered Provider", "U");

        providers[u].RegisterService(t, () => factoryMethod());
        types.Add(t, providers[u]);
    }

    public void RegisterServiceProvider(IRepositoryServiceProvider provider)
    {
        providers.Add(provider.GetType(), provider);
    }

    public TService GetInstance<TService>() where TService : class
    {
        Type t = typeof(TService);
        if (!this.types.ContainsKey(t)) return default(TService);
        return this.types[t].GetService(t) as TService;
    }
}

public class FactoryServiceProvider : IRepositoryServiceProvider
{
    private readonly Dictionary<Type, FactoryDelegate> factories =
        new Dictionary<Type, FactoryDelegate>();

    #region IRepositoryServiceProvider Members

    public void RegisterService(Type targetType, FactoryDelegate factoryMethod)
    {
        factories.Add(targetType, factoryMethod);
    }

    #endregion

    #region IServiceProvider Members

    public object GetService(Type serviceType)
    {
        return (factories.ContainsKey(serviceType)) ? factories[serviceType]() : null;
    }

    #endregion
}

public class SingleServiceProvider : IRepositoryServiceProvider
{
    private readonly Dictionary<Type, object> objects =
        new Dictionary<Type, object>();

    #region IRepositoryServiceProvider Members

    public void RegisterService(Type targetType, FactoryDelegate factoryMethod)
    {
        objects.Add(targetType, factoryMethod());
    }

    #endregion

    #region IServiceProvider Members

    public object GetService(Type serviceType)
    {
        return (objects.ContainsKey(serviceType)) ? objects[serviceType] : null;
    }

    #endregion
}

IOC Containers

Friday, November 14, 2008

Inversion of control is an interesting pattern, but it has a flaw. When building a system following the IOC pattern, the code bloat seems to bubble up to the top. The higher level objects start creating and managing all the required parts. This starts to fall down when you scale it up beyond a small number of objects. The accepted solution is to move most of the object creation into an IOC container, like Windsor, StructureMap, Autofac, etc.

You register a type, and dependencies, with the IOC container and then just ask for it when you need it. That is great, except I don't know how or why they work the way they do. Most people have responded to my questions with the classic open source creed, “you can look at the source.” Reading the source is like reading a scientific research paper. So, I tried building one myself.

public delegate object FactoryDelegate();
public delegate T FactoryDelegate<T>();

public class Repository
{
    private readonly Dictionary<Type, FactoryDelegate> factories =
        new Dictionary<Type, FactoryDelegate>();

    public void Register<T>(FactoryDelegate<T> factoryMethod) where T : class
    {
        this.factories.Add(typeof(T), () => factoryMethod());
    }

    public TService GetInstance<TService>() where TService : class
    {
        Type t = typeof(TService);
        if (!this.factories.ContainsKey(t)) return default(TService);
        return this.factories[t]() as TService;
    }
}

After a few hours of tinkering, my basic IOC container is 16 lines, including error handling.

Usage:

var r = new Repository();
r.Register<IMyClass>(() => new MyClass("a"));
IMyClass m = r.GetInstance<IMyClass>();

A factory method is registered for each type you want the Repository to handle. Each time the GetInstance method is called, it retrieves the stored factory method, executes it, and returns a strongly typed result. I think I will explore this a little more.

Musicals and Theater

Saturday, July 19, 2008

The theater production and the musical are not dead. But I do think they are making themselves inaccessible. Doctor Horrible's Sing-Along Blog shows a promising path.

I would watch more theater and musicals if they were as accessible as a music video on YouTube.

I will welcome the day this bandwagon comes.

The Role of Web Browsers

Friday, July 11, 2008

I found an old blog post from greggles. I don't know who he is, or what he does, but he has a good point. The web application inside the browser is ideal for simple applications, but starts to suffer at higher complexities. Now, you could argue that any number of client-side frameworks (AIR, Silverlight 2, Gears, etc.) would solve this problem, but it still raises a question about the role of web browsers as a whole.

History Lesson:
The web browser was designed as a document viewer for files on a remote server. Every browser and standard was built to support that. With time, people grew tired of bread and water. So, now we have a "rich" ecosystem of web applications hosted at every point on the globe.

Analysis:
If you generalize enough, the browser, as it stands today, is a platform for developers to write applications that are compatible with multiple OSes, have built-in deployment, change management, and sandboxing, and are ubiquitous. As a programmer, I know all the features listed are surmountable with existing programming techniques and focused effort.

My Question:
Should the web browser continue to take on this role as programming platform?

My Conclusion:
The logical path is to have the web browser evolve into a pure data format viewer component and run applications inside an isolated virtual machine. With an automated system for synchronizing code and data, all the points from above can be solved while allowing for the full use of libraries, tools, and languages available to system programmers.

Now I just have to figure out how to make it easy to use and build it for free.

[Edit] (2008.07.20) I just read the Wikipedia entry for application virtualization, which expresses a better point. Oops. Next time I will think about researching something first.