Software Development – Eric Tucker

Ruby Developers: Manage a Multi-Gem Project with RuntimeGemIncluder (Experimental Release)

A couple of years ago in the dark ages of Ruby, one created one Gem at a time, hopefully unit tested it and perhaps integrated it into a project.

Every minute change in a Gem could mean painstaking work often doing various builds, includes and/or install steps over and over. No more!

I created this simple Gem (a Gem itself!) that at run-time builds and installs all Gems in paths matching patterns defined by you.

I invite brave souls to try out this EXPERIMENTAL release now, pending a more thoroughly tested and mature release. Install RuntimeGemIncluder, define some simple configuration in your environment.rb or a similar place, and use require as you normally would.

Here’s an example I used to include everything in my NetBeans workspace with JRuby.

Download the Gem from http://rubyforge.org/frs/?group_id=9252

To install, go to the directory where you have downloaded the Gem and type:

gem install runtime-gem-includer-0.0.1.gem

(Soon you may be able to install directly from RubyForge by simply typing ‘gem install runtime-gem-includer’.)

Some place before you load the rest of your project (like environment.rb if you’re using Rails) insert the following code:

trace_flag = "--trace"

$runtime_gem_includer_config = {
  :gem_build_cmd => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S rake #{trace_flag} gem",
  :gem_install_cmd => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S gem install",
  :gem_uninstall_cmd => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S gem uninstall",
  :gem_clean_cmd => "\"#{ENV['JRUBY_HOME']}/bin/jruby\" -S rake clean",
  :force_rebuild => false,
  :gem_source_path_patterns => [ "/home/erictucker/NetBeansProjects/*" ],
  :gem_source_path_exclusion_patterns => []
}

require 'runtime_gem_includer'

If you are using JRuby and would like to just use the defaults, the following code should be sufficient:

$runtime_gem_includer_config = {
  :gem_source_path_patterns => [ "/home/erictucker/NetBeansProjects/*" ],
  :gem_source_path_exclusion_patterns => []
}

require 'runtime_gem_includer'

Now simply require your Gem in any source file as you normally would:

require 'my_gem_name'

And you’re off to the races!

Gems are dynamically built and installed at runtime (accomplished by overriding Kernel::require). Edit everywhere, click run, watch the magic! There may be some applications for this Gem in continuous integration. Rebuilds and reloads of specified Gems should occur during application startup/initialization once per instance/run of your application.
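For readers curious what overriding Kernel::require can look like, here is a minimal sketch. It is not the RuntimeGemIncluder source: the RequireHook module, its before_require callback and the require_without_hook alias are hypothetical names chosen for illustration. The idea is simply to run a block once per required name before handing off to the original require.

```ruby
# Hypothetical sketch: intercept `require` so a callback runs first.
# RequireHook and its methods are illustrative names only; they are
# not part of RuntimeGemIncluder.
module RequireHook
  @seen = []
  @callback = nil

  class << self
    attr_reader :seen

    # Register a block to be called once per required name.
    def before_require(&block)
      @callback = block
    end

    # Invoke the callback the first time a given name is required.
    def notify(name)
      return if @seen.include?(name)
      @seen << name
      @callback.call(name) if @callback
    end
  end
end

module Kernel
  # Keep a handle on the existing require (guarded so that loading this
  # file twice does not alias the hook to itself).
  unless private_method_defined?(:require_without_hook)
    alias_method :require_without_hook, :require
    private :require_without_hook
  end

  def require(name)
    RequireHook.notify(name.to_s)
    require_without_hook(name)
  end
  private :require
end

RequireHook.before_require do |name|
  # A real tool could build and install a matching Gem at this point.
  puts "about to require #{name}"
end

require 'set'
```

Because the hook remembers which names it has already seen, any expensive work in the callback happens at most once per name per run, which matches the "once per instance/run" behavior described above.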

Interested in source, documentation, etc.? http://rtgemincl.rubyforge.org/

More Efficient Software = Less Energy Consumption: Green Computing isn’t just Hardware and Virtualization

Originally published 16 November 2009.

Green is a great buzzword, but the real-world driver for many “green” efforts is cost. Data center power is expensive. Years ago, Oracle moved a major data center from California to my town Austin, Texas. A key reason: more predictably priced, cheaper power in Texas vs. California. What if Oracle could make the data center half the size and take half the power because its software ran more efficiently?

Your bank, your brokerage, Google, Yahoo, Facebook, Amazon, countless e-commerce sites and more often require surprisingly many servers. Servers have traditionally been power-hungry things favoring reliability and redundancy over cost and power utilization. As we do more on the web, servers do more behind the scenes. The amount of computing power or various subsystem capabilities required varies drastically based on how an application works.

These days, hardware vendors across the IT gamut try to claim their data center and server solutions are more power efficient. The big push for consolidation and server virtualization (the practice by which one physical server functions as several virtual servers which share the hardware of the physical machine) does make some real sense. In addition to using less power, such approaches often simplify deployment, integration, management and administration. It’s usually easier to manage fewer boxes than more, and the interchangeability facilitated by things like virtualization, combined with good planning, makes solutions more flexible and able to scale more effectively on demand.

Ironically, the issue people seem to pay the least attention to is perhaps the most crucial: the efficiency of software. Software orchestrates everything computers do. The more computer processors, memory, hard drives and networks do, the more power they need and the bigger or more plentiful they must be. The more operations servers must perform, the more servers, or the more power-hungry servers, one needs. The software is in charge. When it comes to operations the computer performs, the software is both the CEO and the mid-level tactical managers that can make all the difference in the world. If software can be architected, coded or compiled to manage operations more efficiently, the number of operations per unit of work produced goes down. Every operation saved means power saved.

Computers typically perform a lot of overly redundant or otherwise unneeded operations. For example, a lot of data is passed across the network not because it absolutely needs to be, but because it’s easier for a developer to build an app that operates that way or for the application to be deployed that way in production. There are applications that use central databases for caches when a local in-memory cache would not only be orders of magnitude faster but also burn less power. Each time data goes across a network it must be processed on each end and often formatted and reformatted multiple times.
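To illustrate the local in-memory cache idea, here is a minimal Ruby sketch. The LocalCache class and the fake "remote" lookup are hypothetical stand-ins, and the sketch deliberately ignores expiry, invalidation and thread safety, all of which a production cache would need to consider.

```ruby
# Hypothetical sketch: a tiny in-process cache in front of a slow lookup.
class LocalCache
  attr_reader :misses

  def initialize(&loader)
    @loader = loader # block that performs the expensive or remote fetch
    @store  = {}
    @misses = 0
  end

  # Return the cached value, calling the loader only on the first request.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @misses += 1
    @store[key] = @loader.call(key)
  end
end

# Stand-in for a round trip to a central database or web service.
remote_calls = 0
cache = LocalCache.new do |id|
  remote_calls += 1
  "user-#{id}"
end

cache.fetch(42)
cache.fetch(42) # served from local memory; no second remote call
```

Every repeated lookup that hits the local hash avoids a network round trip along with the parsing and reformatting on both ends.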

A typical web service call (REST, SOAP, etc.) – the so-called holy grail of interoperability, modularity and inter-system communication in some communities – is a wonderful enabler, but it does involve parsing (e.g. turning text data into things the computer understands), marshalling (a process by which data is transformed typically to facilitate transport or storage) and often many layers of function calls, security checks and other things. The use of web services is not inherently evil, but far more carbon gets burned making a web service call to a server across the country, or even inches away, than having the computer talk to its own memory. It’s also a lot slower.

Don’t get me wrong, I’m a big believer in the “army of ants” approach. However, I see the next big things in power utilization being software driven. We’re going to reach a point where we’ve consolidated all we reasonably can, and at that point it’s going to be a focus on making the software more efficient.

If my code runs in a Hadoop-like cluster (Hadoop is open source software that facilitates computing across many computers) and the framework has tremendous overhead compared to what I’m processing, how much smaller could I make the cluster if I could remove that overhead? What if I process more things at once in the same place? What if I batch them more? What if I can reduce remote calls? What if I explore new languages like Go with multi-core paradigms? What if widely deployed operating systems like Linux, Windows and MacOS became more power efficient? What about widely used apps consuming less power-hungry memory? What about security software taking fewer overhead CPU cycles? Can we use multi-core processing more efficiently?

In most cases, performance boosts and power savings go hand-in-hand. Oriented toward developers, here are a few more obvious areas for improvement. Most are pre-existing good software design practices:

– Caching is the first obvious place: (1) more caching of information, (2) less reprocessing of information, (3) more granular caching to facilitate caching where it was not previously done.

– Data locality: Do processing as close to where data resides as possible to reduce transportation costs. Distance is often best measured not in physical distance but in the number of subsystems (both hardware and software) that data must flow through.

– Limit redundant requests: Once you have something retrieved or cached locally, use it intelligently: (1) collect changes locally and commit them to a central remote location such as a database only as often as you need to, (2) use algorithms that can account for changes without synchronizing as often with data on other servers.

– Maximize use of what you have: A system is burning power if it’s just on. Use the system fully without being wasteful: (1) careful use of non-blocking (things that move on instead of having the computer wait for a response from a component) operations in ways that let the computer do other things while it’s waiting; (2) optimize the running and synchronization of multiple processes to balance use, process duration and inter-process communication such that the most work gets done with least waiting or overhead.

– Choose the language, platform and level of optimization based on amount of overall resources consumed: Use higher performance languages or components and more optimizations for sections which account for the most resource utilization (execution time, memory use, etc.). Conversely, use easier to build or cheaper components for sections that account for less overall resource use so that more focus can go to critical sections. (I do this in practice by mixing Ruby, Java and other languages inside the JRuby platform.)
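As one small illustration of the "limit redundant requests" point in the list above, here is a Ruby sketch that collects changes locally and commits them in batches. The BatchedWriter class and the in-memory "sink" are hypothetical; in a real application the sink would be a database or remote service, and the batch size would be tuned to how stale the central copy is allowed to be.

```ruby
# Hypothetical sketch: buffer changes locally and commit them in batches.
class BatchedWriter
  attr_reader :flushes

  def initialize(batch_size, &sink)
    @batch_size = batch_size
    @sink       = sink # block that commits one batch to the remote store
    @pending    = []
    @flushes    = 0
  end

  # Queue a record; commit automatically once a full batch has built up.
  def write(record)
    @pending << record
    flush if @pending.size >= @batch_size
  end

  # Commit whatever is pending as a single request.
  def flush
    return if @pending.empty?
    @sink.call(@pending.dup)
    @pending.clear
    @flushes += 1
  end
end

committed = []
writer = BatchedWriter.new(3) { |batch| committed << batch }
7.times { |i| writer.write(i) }
writer.flush # push the final partial batch
```

Here seven writes turn into three commits instead of seven. The trade-off is that uncommitted changes live only in local memory until the next flush.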

In certain applications, maybe we don’t care about power utilization or much at all about efficiency, but as applications become increasingly large and execute across more servers, development costs in some scenarios may become secondary to the cost of computing resources. Some goals are simply not attainable unless an application makes efficient use of resources, and that focus on efficiency may pay unexpected dividends.

Developers, especially those of large-scale or widely deployed applications: if we want to be greener, let’s focus on run-times, compilers and the new and yet-to-be-developed paradigms for distributed, massively multi-core computing.

There is a story that Steve Jobs once motivated Apple engineers to make a computer boot faster by explaining how many lifetimes of waiting such a boost might save. Could the global impact of software design be more than we imagine?

[ Edit 14 October 2018: See also: Math Pierces Steve Jobs’ Reality Distortion Field after 35 Years. ]

Why Ruby on Rails + JRuby over PHP: My Take, Shorter Version

As a Ruby, Java and occasional C/C++ developer who has also written some production code in PHP, I work with and tend to prefer the power and flexibility provided by a JRuby + NetBeans + Glassfish stack over PHP. Here is my attempt to describe why, somewhat briefly, and to encourage others to develop in RoR rather than PHP:

Pros

– Exceptionally high developer productivity with:

“Programming through configuration” philosophy
Emphasis on rather complete default behaviors
Write-once (DRY) orientation
Simple ORM (ActiveRecord) means a lot less SQL with minimal fuss
Dynamically typed language means a lot less thinking about variable declarations
Result: A lot less grunt work; more focus on “real work”

– Strongly encourages clean MVC architecture

– Test frameworks

Test::Unit is easy to use and effective
Enables test driven development (TDD) often omitted in PHP world
UI mocking frameworks are available

– Pre-packaged database migrations feature eases schema creation and changes

Helper methods further simplify changes and help avoid writing SQL
Roll back or forward to arbitrary versions

– Significant pre-packaged forms and JavaScript/AJAX UI support

– Ruby language easy to learn and more versatile

Like PHP, Ruby language’s initial learning curve is much easier than Java, C#, etc.
Like PHP, Ruby is conducive to scripting, and it has slightly better OOP support
Ruby language skills can be leveraged for use in environments outside web applications

– Vendor support by Sun Microsystems

Dedicated team and significant JRuby project
Good support in NetBeans IDE
Quality Glassfish app server from JEE world
Provides integrated NetBeans, Glassfish, JRuby stack in one download

– Tap JEE power from within Ruby

JRuby allows fairly seamless access to Java and JEE libraries and features as well as your own Java code should you desire
Result: You can start simple without being boxed in, and you can later add a lot of enterprise-grade sophistication.

– Community

Contains a lot of talent from JEE world
Libraries that implement simpler versions of enterprise-oriented features
Community tends to be rather friendly and inclusive

Cons

– Maturity

Despite making huge strides, acceptance remains low at more conservative companies
Hosting options limited in comparison to PHP
- Dedicated server or VPS
- Amazon EC2
- Smaller pool of shared hosts
The ORM can be a memory hog
Fewer jobs open due to fewer projects (job to applicant ratio might be greater though?)
Fewer sysadmins and established maintenance procedures
Less support, fewer developers to maintain RoR apps

– With a conventional LAMP-like architecture, scalability limitations are comparable to those of most PHP solutions, and RoR can be more resource intensive

– Of course, if venturing heavily into cross-platform JEE territory the learning curve steepens dramatically

LINK: Interview about Data Grids by Ryan Slobojan w/ Cameron Purdy, VP Development at Oracle

Interview with Cameron Purdy, VP Development at Oracle, about data grids. Interesting insights, and several things I’ve been saying for a good while. 🙂

http://www.infoq.com/interviews/Data-Grid-Cameron-Purdy

OpenCL – common framework for CPU+GPU computing

Very interesting:

“OpenCL is a programming framework that allows software to run on both the CPU and the graphics processor of the computer.”

“…earlier this year Apple offered OpenCL to the Khronos Group, a standards-setting organization, and Intel, Nvidia and AMD joined forces to create a standard that would work on multiple chips.”

source: http://gigaom.com/2008/12/26/opencl-gives-your-computer-wings/

Thanks JranDe for the heads up on this.

Wikipedia has a code example:
http://en.wikipedia.org/wiki/OpenCL#Example

Introducing COHESION – highly automated open source ORM for Java — CALLING CONTRIBUTORS!

My first open source project …

Think features of ActiveRecord + Hibernate + a little more – limitations on some data structures. Designed for maximum ease and speed of development for common applications.

The goal:

orm.save( classInstance1 );
// boom! – cohesion creates a table(s) if need be
// record gets saved

// look ups by example class
classInstanceExample.setName( "Sparky" );
classInstance2 = orm.load( classInstanceExample );

// or by field names and values
Map<String, Object> m = new HashMap<String, Object>();
m.put( "name", "Sparky" );
classInstance3 = orm.find_by_example( m );

Looking to bring ActiveRecord-like functionality to a Java platform with the added Hibernate style bonus of being able to generate a schema automagically … but even better … on the fly from the class using reflection. Unlike Hibernate – no annotations or schema definitions necessary.

Cohesion is not a clone or port of ActiveRecord or Hibernate but meant to provide similar functionality drawing on the strengths and lessons of each of these very important and powerful projects. At least for a good while, I do not anticipate Cohesion will provide the same performance as more mature products like Hibernate, but it will be easier to code with.

I have preliminary code for doing lists using joins, and I am definitely borrowing some ideas from Hibernate. Barring some pretty big contributions from others, I expect some limitations on more complex data structures, at least in any early versions.

SourceForge project:
http://www.sourceforge.net/projects/cohesion

Browse source:
http://cohesion.svn.sourceforge.net/viewvc/cohesion

This is code I started on last year and decided recently to open source currently under an Apache 2 license. I’m open to some discussion on licenses.

If this project makes it to maturity it could provide a very widely used fundamental building block for a lot of development and improve productivity in a lot of places.

Also a founding member, Matthew Molinyawe will be working on this project with me.

Please do comment/drop me or Matt a line if you wish to contribute.