Using Riak 2.x on Mac OS X with 7DBin7Weeks

I've been trying to work through Seven Databases in Seven Weeks, and have run into problems installing Riak a couple of times and given up. I finally got through it. One issue is that the book uses Riak 1.x, whereas Riak 2.x is the current version. Also, the default ports and commands have changed a bit, so the commands in the book don't work.

The key to getting Riak running is installing the correct versions of both Erlang and Riak from source.

Installing Riak requires the custom Erlang from Basho. Don't install Erlang using OS X's brew (you can only get version 17) or from the standard Erlang distro (you get R16A). You must install R16B02 from Basho per the instructions here. I added '--prefix=/usr/local/otp_R16B02' to the 'configure' command, and then after installing added it to my path with 'export PATH=$PATH:/usr/local/otp_R16B02/bin' in .bash_profile.

If you try to use the Erlang 17 installed by brew, you get a version mismatch error when trying to build Riak (17 instead of R16B). Manually changing the expected version will not work either, because Riak uses some custom flags that are not available in the standard Erlang tooling.

If you try to use the standard Erlang R16A package, you will get an error on the console like:

xxx:riak-2.0.4 phil.varner$ dev/dev1/bin/riak start
!!!!
!!!! WARNING: ulimit -n is 32768; 65536 is the recommended minimum.
!!!!
riak failed to start within 15 seconds,
see the output of 'riak console' for more information.
If you want to wait longer, set the environment variable
WAIT_FOR_ERLANG to the number of seconds to wait.

and then from riak console

LT-A10-122189:riak-2.0.4 phil.varner$ dev/dev1/bin/riak console
config is OK
-config /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/data/generated.configs/app.2015.02.20.23.56.47.config -args_file /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/data/generated.configs/vm.2015.02.20.23.56.47.args -vm_args /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/data/generated.configs/vm.2015.02.20.23.56.47.args
!!!!
!!!! WARNING: ulimit -n is 32768; 65536 is the recommended minimum.
!!!!
Exec:  /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/bin/../erts-5.10/bin/erlexec -boot /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/bin/../releases/2.0.4/riak               -config /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/data/generated.configs/app.2015.02.20.23.56.47.config -args_file /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/data/generated.configs/vm.2015.02.20.23.56.47.args -vm_args /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/data/generated.configs/vm.2015.02.20.23.56.47.args              -pa /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/bin/../lib/basho-patches -- console
Root: /Users/phil.varner/Downloads/riak-2.0.4/dev/dev1/bin/..
bad scheduling option -sfwi
Usage: beam.smp [flags]...

The error "bad scheduling option -sfwi" is because it's trying to use the custom flags in Basho's Erlang, which obviously don't exist in the standard Erlang tools.

After installing the Basho Erlang, install Riak from here and follow the instructions.

You'll probably need to change your ulimit settings as well; there are excellent instructions for that here.

There are quite a few changes in Riak 2.x from the Riak 1.x used in the book. The Basho Riak Five-Minute Install has good examples of the new syntax.

The riak-admin command has changed since 7DBin7Weeks p2.0 was published, so the 'join' command is now like 'dev/dev2/bin/riak-admin cluster join dev1@127.0.0.1', and it only stages joins without actually applying them. You can see the staged changes with 'dev/dev1/bin/riak-admin cluster plan' and apply them with 'dev/dev2/bin/riak-admin cluster commit'.

The default port for the HTTP interface is now 10018 instead of 8091, so all of the URLs are now like 'http://localhost:10018/stats'.

I also ran into an issue where curl couldn't connect to the Riak port 10018 using "localhost", but connecting directly to 127.0.0.1 worked fine.
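
If you'd rather check the node from code than from curl, here's a minimal Java sketch (my own, not from the book) that fetches the stats page. It assumes the dev1 node's HTTP interface is on port 10018 and, per the issue above, uses 127.0.0.1 rather than localhost:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RiakStatsCheck {
    public static void main(String[] args) throws Exception {
        // 127.0.0.1 rather than localhost, which didn't resolve for me
        URL url = new URL("http://127.0.0.1:10018/stats");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        System.out.println("HTTP " + conn.getResponseCode());
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            System.out.println(in.readLine()); // first line of the stats JSON
        }
    }
}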

Using Apache CXF 2.7, Struts2 2.3, and ASM 5 with Maven

In an application I work on, we recently upgraded Struts2 to 2.3.20. We then started getting exceptions at runtime with the cause of:

Caused by: java.lang.VerifyError: class
net.sf.cglib.core.DebuggingClassWriter overrides final method
visit.(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V

The problem here is that in cglib 2.2.2, net.sf.cglib.core.DebuggingClassWriter extends the ASM class org.objectweb.asm.ClassWriter and overrides its visit method. This only works with ASM 3; in ASM 4 and 5 there's a different pattern of extension to follow, and ClassWriter's visit method is final.
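
For context, here is a minimal sketch of the extension pattern ASM 4 and 5 expect (the class name is made up for illustration; this is not cglib's code): instead of subclassing ClassWriter and overriding visit, you wrap the writer in a ClassVisitor and override visit there.

import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.Opcodes;

// Hypothetical example of the ASM 4/5 pattern: delegate to the ClassWriter
// instead of extending it, since ClassWriter.visit() is now final.
public class LoggingClassVisitor extends ClassVisitor {

    public LoggingClassVisitor(ClassWriter cw) {
        super(Opcodes.ASM5, cw); // all events are forwarded to the writer
    }

    @Override
    public void visit(int version, int access, String name, String signature,
                      String superName, String[] interfaces) {
        System.out.println("visiting class: " + name);
        super.visit(version, access, name, signature, superName, interfaces);
    }
}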

So the dependencies are:

  • CXF 2.7.14 depends on cglib 2.2 and ASM 3.3.1
  • Struts2 2.3.20 depends on cglib-nodep 2.1 and ASM 5.0.2
  • Our webapp has a dep on cglib-nodep 2.2.2

I first tried downgrading everything, but ran into some issues with Java 8 bytecode with ASM 3 and 4, so we had to upgrade to ASM 5. The only use of ASM is in xwork-core by com.opensymphony.xwork2.util.finder.ClassFinder, which is only used by PackageBasedActionConfigBuilder in the "convention" plugin, which my application doesn't use; so, theoretically, if we were compiling to target 1.6 or 1.7 we could downgrade that to ASM 3 or 4. When Struts2 2.3.22 comes out, it looks like the resolution of the issue "CXF broken with upgrade of ASM 5 since Struts 2.3.20" will allow that.

Using CXF and our app with cglib 3.1 and ASM 5 seems to be working fine, at least so far. All of the CXF tests (apart from a few of the CORBA ones) passed with ASM 5.

I updated the parent pom's properties and dependencyManagement dependencies to use the right versions and exclude the wrong ones. One change is that we previously used the cglib-nodep artifact, which shades the ASM classes into net.sf.cglib.asm instead of having the real ASM artifact as a transitive dependency.

Add properties for the versions, since there are a few dependencies for each one:

<properties>
    <cxf.version>2.7.14</cxf.version>
    <struts2-version>2.3.20</struts2-version>
    <asm.version>5.0.2</asm.version>
    <cglib.version>3.1</cglib.version>
</properties>

In the parent pom, set the versions to use and the transitive exclusions for the wrong ones:

<dependencyManagement>
...
        <dependency>
            <groupId>org.apache.cxf</groupId>
            <artifactId>cxf-bundle</artifactId>
            <version>${cxf.version}</version>
            <exclusions>
                <exclusion> <!-- exclude version 3 of asm -->
                    <groupId>asm</groupId>
                    <artifactId>asm</artifactId>
                </exclusion>
                ...
            </exclusions>
        </dependency>

        <dependency> 
            <groupId>org.apache.struts</groupId>
            <artifactId>struts2-core</artifactId>
            <version>${struts2-version}</version>
            <exclusions>
                <exclusion>
                    <groupId>javassist</groupId>
                    <artifactId>javassist</artifactId>
                </exclusion>
                <exclusion> <!-- exclude version 2.1-->
                    <groupId>cglib</groupId>
                    <artifactId>cglib-nodep</artifactId>
                </exclusion>
                <exclusion> <!-- we prefer our explicit version, though it should be the same -->
                    <groupId>org.ow2.asm</groupId>
                    <artifactId>asm</artifactId>
                </exclusion>
                <exclusion> <!-- we prefer our explicit version, though it should be the same -->
                    <groupId>org.ow2.asm</groupId>
                    <artifactId>asm-commons</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
...
        <dependency>
            <groupId>cglib</groupId>
            <artifactId>cglib</artifactId>
            <version>${cglib.version}</version>
            <exclusions>
                <exclusion> <!-- exclude v. 4.2-->
                    <groupId>org.ow2.asm</groupId>
                    <artifactId>asm</artifactId>
                </exclusion>
                <exclusion> <!-- exclude v. 4.2 -->
                    <groupId>org.ow2.asm</groupId>
                    <artifactId>asm-util</artifactId>
                </exclusion>
                <exclusion> <!-- no ant! -->
                    <groupId>ant</groupId>
                    <artifactId>ant</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

Explicitly add the ASM version we want:

        <dependency>
            <groupId>org.ow2.asm</groupId>
            <artifactId>asm</artifactId>
            <version>${asm.version}</version>
        </dependency>
        <dependency>
            <groupId>org.ow2.asm</groupId>
            <artifactId>asm-util</artifactId>
            <version>${asm.version}</version>
        </dependency>
        <dependency>
            <groupId>org.ow2.asm</groupId>
            <artifactId>asm-commons</artifactId>
            <version>${asm.version}</version>
        </dependency>

and then in the dependencies section of the build pom, explicitly add cglib and asm:

<dependencies>
...
    <dependency>
        <groupId>cglib</groupId>
        <artifactId>cglib</artifactId>
    </dependency>
    <dependency>
        <groupId>org.ow2.asm</groupId>
        <artifactId>asm</artifactId>
    </dependency>
    <dependency>
        <groupId>org.ow2.asm</groupId>
        <artifactId>asm-util</artifactId>
    </dependency>
    <dependency>
        <groupId>org.ow2.asm</groupId>
        <artifactId>asm-commons</artifactId>
    </dependency>
...
</dependencies>
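
A quick way to verify the result is to run mvn dependency:tree on the webapp module and confirm that only the org.ow2.asm 5.0.2 artifacts and cglib 3.1 appear, with no stray asm:asm 3.x or cglib-nodep entries.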

That seems to be working, but please comment if you have any additions to this!

Why does a tuning fork sound different than a piano, even if they’re playing the same note?

The objective aspects of a sound, like frequency and amplitude, can be measured. A tuning fork and a piano may be playing exactly the same note at exactly the same volume, and may be perceived by a listener to have the same pitch and loudness, but the two will never sound "the same" to a listener. The combination of objective and subjective aspects that distinguishes two such sounds from one another is the timbre.

Listen to this recording of several different instruments all playing A440 and notice how they all sound different:

Timbre is a combination of both the objective physical properties of a sound wave and the subjective psychoacoustic perception of the listener. Objective aspects are those that can be definitively measured, and are usually related to the physical propagation of the sound. There are several of these, but the two most important are:

  1. The instantaneous combinations of frequencies, including the fundamental tone and its related harmonics (also known as partials or overtones)
  2. The change in the frequency and amplitude over time, typically referred to as the attack, decay, sustain, release (ADSR)

Any physical instrument is going to play not only the fundamental but also harmonics. These harmonics are frequencies in the sound that are integer multiples of the fundamental tone. For example, the A above middle C has a frequency of 440Hz, so it will generate harmonics of 880Hz (x2), 1320Hz (x3), 1760Hz (x4), and on and on. Theoretically they're infinite, but in most practical situations the higher harmonics are too soft to be heard or noticed over the louder lower harmonics.

For example, in this frequency graph of one instant of a Grand Piano playing A440, we see peaks at exactly these integer multiples:

Because all of these frequencies exist at the same time, they combine to form a complex waveform:

So instead of creating a simple sine wave, like a tuning fork or a single cathedral organ pipe does, the harmonics create complex waves that human ears perceive as "interesting".
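
As a rough illustration (a toy sketch of my own, not taken from the recordings above), you can build a complex wave like this yourself by summing sine waves at integer multiples of 440Hz; the relative amplitudes below are made up:

public class HarmonicsDemo {
    public static void main(String[] args) {
        double fundamental = 440.0;               // A440 in Hz
        double[] amplitudes = {1.0, 0.5, 0.25};   // hypothetical relative strengths
        int sampleRate = 44100;

        // Print the first 100 samples of the combined waveform
        for (int i = 0; i < 100; i++) {
            double t = (double) i / sampleRate;
            double sample = 0.0;
            for (int h = 0; h < amplitudes.length; h++) {
                // harmonic (h+1) is an integer multiple of the fundamental
                sample += amplitudes[h] * Math.sin(2 * Math.PI * fundamental * (h + 1) * t);
            }
            System.out.printf("%8.6f%n", sample);
        }
    }
}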

In addition to the instantaneous frequencies, the waveform can change over time in the ADSR cycle — attack (when the note is first struck), decay (from the initial strike down to the sustain), sustain (the long part of the note), and release (when the note ends).

Attack and Decay

Sustain

Release

You can see the harmonics and the ADSR cycle demonstrated in this video. It looks best in 720p HD and full screen. Watch the frequency graph at the bottom of the video change over time:

These instruments were played in GarageBand, so it's a simulated instrument and not exactly perfect, but the main point here is that even approximations of real instruments are extremely complicated. It's also interesting to note how some of the "recordings" have very different left and right stereo tracks, which is an attempt by the software instrument to sound more like a real instrument, even though a recording of an actual instrument would likely have exactly the same waveform on both tracks.

In the audio recordings below, you can hear what sounds with different timbres sound like and compare their frequency graphs and waveforms during the sustain.

Grand Piano

Cathedral Organ

This is exactly the same timbre as a tuning fork, as one is a simple vibrating rod and the other is a simple vibrating column of air.

Grand Organ

Acoustic Guitar

French Horns

Clarinet

Analog Mono Lead

This one is actually very complicated, as the waveform changes in a cycle over a period of seconds.

Electric Buzz

Conclusion

A tuning fork sounds different than a piano because a tuning fork has only one fundamental note with a uniform waveform throughout its playing, whereas a piano has complex harmonics and great variation throughout its attack, decay, sustain, and release, which makes their timbres very different. While it's very complex to analyze a sound wave and break it into its combinations of frequencies and amplitudes, it's easy for most humans to hear even small differences in timbre. The timbre is determined both by the physical properties of the sound wave that we've described here and by the perception of the listener. Timbre is what makes sounds interesting.

Multi-extends in generified types

In Effective Java, I came across a language construct I'd never seen before:

public class Foo<T extends List & Comparator> { 
    <U extends List & Comparator> void foo(U x) { }
}

This declares that T must extend or implement both List and Comparator. I've never had occasion to use this, but I can imagine it would be useful. The example Bloch gives in the book is when T is derived from one class and implements an interface.
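
As an illustration of my own (not Bloch's example), here is a hypothetical type that satisfies both bounds and can therefore be passed to foo:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// A made-up type that is both a List and a Comparator
class SortableNames extends ArrayList<String> implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        return a.compareToIgnoreCase(b);
    }
}

public class MultiBoundDemo {
    // U must be both a List and a Comparator, mirroring the example above
    static <U extends List & Comparator> void foo(U x) {
        System.out.println("size=" + x.size());
    }

    public static void main(String[] args) {
        foo(new SortableNames()); // compiles because SortableNames meets both bounds
    }
}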

Unicode in Java: some Groovy pieces (part 7)

One of the common tasks Java developers use Groovy for is testing. One of the common idioms I use is to create a list of strings and use the "each" method to assert that an output file contains them. When testing Unicode, this means both the output files and the Groovy source files contain Unicode characters. For example, the code may contain:

def contents = new File(outputFile).getText("UTF-8")

[ "D'fhuascail Íosa Úrmhac na hÓighe Beannaithe pór Éava agus Ádhaimh",
  'イロハニホヘト チリヌルヲ ワカヨタレソ ツネナラム',
  'เป็นมนุษย์สุดประเสริฐเลิศคุณค่า'
].each { assertTrue(contents.contains(it), "${it} not in ${outputFile}") }

The first point is that we can no longer use the File#text method; we need to use the getText method that takes a character encoding scheme argument.

The second point is that when Java or Groovy source files contain Unicode characters, you must specify what the encoding of those files is. In this case, we've saved our source files in UTF-8 encoding. As with the JVM, javac and groovyc will default to using the platform default encoding if none is specified, which would give us odd errors when the non-printable characters that resulted from incorrectly decoding the UTF-8 were fed to the compiler.

When I call groovyc from Ant, this is the code I use:

         <groovyc srcdir="." includes="com/example/**/*.groovy" destdir="${twork}" encoding="UTF-8">
            <classpath refid="example.common.class.path"/>
         </groovyc>

For more on Groovy and Unicode, Guillaume has an excellent post, Heads-up on File and Stream groovy methods.

Unicode in Java: bytes and charsets (part 6)

In this part, I'll discuss some of the lower-level APIs for converting byte arrays to characters and a bit more about the Charset and CharsetDecoder classes.

The String class has two constructors that will decode a byte[] using a specified charset: String(byte[] bytes, String charsetName) and String(byte[] bytes, Charset charset). Likewise, it has two instance methods for doing the opposite: byte[] getBytes(String charsetName) and byte[] getBytes(Charset charset). It is almost always wrong to use the String(byte[]) constructor or the byte[] getBytes() method, since these use the default platform encoding. It is nearly always better to choose a consistent encoding to use within your application, typically UTF-8, unless you have a good reason to do otherwise.

In the previous part, we used the Charset class to retrieve the default character encoding. We can also use it to retrieve the Charset instance for a given string name with the static method Charset.forName(String charsetName), e.g., Charset.forName("UTF-8"). In addition to String having methods that take either a string name of the encoding or the Charset instance, most of the Reader classes do too. In my previous examples I showed using the version where "UTF-8" is specified, but the better way is to have a static final attribute that contains the value of Charset.forName("UTF-8") and use that. It eliminates the need to repeatedly look up the Charset, and it prevents a typo in the charset name from creating a hard-to-find bug.
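
Putting that together, here is a small sketch of the pattern (on Java 7 and later you could use java.nio.charset.StandardCharsets.UTF_8 instead of the forName lookup):

import java.nio.charset.Charset;
import java.util.Arrays;

public class CharsetRoundTrip {
    // One shared constant instead of repeating the string "UTF-8" everywhere
    private static final Charset UTF8 = Charset.forName("UTF-8");

    public static void main(String[] args) {
        String original = "Íosa Úrmhac";             // non-ASCII sample text
        byte[] encoded = original.getBytes(UTF8);    // String -> bytes, explicit charset
        String decoded = new String(encoded, UTF8);  // bytes -> String, same charset

        System.out.println(Arrays.toString(encoded));
        System.out.println(original.equals(decoded)); // true
    }
}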

The CharsetDecoder class is provided for when you need more control over the decoding process than the String methods provide. This definitely falls into the "advanced" category, so I'm not going to cover it here. Aaron Elkiss has a good writeup, as does the javadoc.

Unicode in Java: sample data (part 5)

When testing Unicode with your application, you need some examples. Most people don't have Thai or Katakana files sitting around, so finding test data is hard.

I've been playing around with JavaScript and jQuery recently, so I thought I'd build a small app that would render Unicode characters from a variety of languages in a variety of scripts. You can cut-and-paste the examples into your own test files, or, since the HTML file contains the characters themselves (instead of HTML escape codes), you could even use the file itself as test data. It even has Klingon :)

unicode_app

Markus Kuhn has a lot of good examples, including "quick brown fox" examples in many languages (unfortunately Chinese is not among them).

Unicode in Java: Default Charset (part 4)

In this part, I will discuss the default Charset and how to change it.

The default character set (technically a character encoding) is set when the JVM starts. Every platform has a default default, but the default can also be configured explicitly. For example, Windows XP 32 bit (English) defaults to "windows-1252", which is the CP1252 encoding that provides for encoding most Western European languages.

The default charset can be printed by calling:

System.out.println(java.nio.charset.Charset.defaultCharset());

When the JVM is started, the default charset can be set with the property "file.encoding", e.g., "-Dfile.encoding=utf-8". Some IDEs will do this automatically; for example, NetBeans uses this property to explicitly set the charset to UTF-8. The drawback to this is that code that uses a class like FileReader, which relies on the default encoding, may work correctly when handling Unicode in the development environment, but then break when used in an environment that has a different default encoding. The developer should not rely on the user to set the encoding for the code to work correctly.
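
As a small illustration of that pitfall, compare FileReader, which always uses the default charset, with InputStreamReader, which can be given an explicit one (the file name here is just a placeholder):

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class DefaultCharsetDemo {
    public static void main(String[] args) throws IOException {
        System.out.println(java.nio.charset.Charset.defaultCharset());

        // Depends on -Dfile.encoding / the platform default:
        try (BufferedReader r = new BufferedReader(new FileReader("input.txt"))) {
            System.out.println(r.readLine());
        }

        // Behaves the same everywhere, regardless of the default:
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new FileInputStream("input.txt"), "UTF-8"))) {
            System.out.println(r.readLine());
        }
    }
}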

Also, one might think they could just alter the system property "file.encoding" programmatically. However, this cannot be set after the JVM starts, as by that time all of the system classes which rely on this value have already cached it.

In Linux/Unix, you can also set the LC_ALL environment variable to affect the default encoding. For example, on one Linux box I have, the default is US-ASCII. When I set "export LC_ALL=en_US.UTF-8", the default encoding is UTF-8.

The environment variables LANG and LC_CTYPE will also have a similar effect (more here).

In summary, the default charset is used by many classes when a character set is not explicitly specified, but this charset should not be relied upon to work correctly when your application is supposed to handle Unicode.