Dremio

Recently I’ve been pouring much professional time into learning the internal workings of Dremio. For those that have not heard about this tool, you can read much more about it on their web site.

Learning Maven and the packages related to the dremio OSS application have been quite an effort. Oh sure, I’ve done the occasional hello world java app, but nothing this big. I cannot say that I’m even close to learning it all, but at least I have a better understanding on how to build, revise and debug it.

Build from Dremio-oss

I’m going to walk through my build steps for dremio-oss. Starting, for the most part, from a bare machine up to the point where the server is up and running for the first time.

I’ll be using Google Cloud Platform (gcp). (Not that it should matter… and I won’t mention it unless it becomes relevant…)

I say “for the most part” because I didn’t want to re-pull all the java packages that are needed for this. So this build example assumes that you have a fairly robust maven repository (~/.m2/repository).

System Setup

Here is my system setup.

> echo $PATH
/usr/local/bin /usr/bin /bin /usr/games

> uname -a
Linux go-devel 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64 GNU/Linux

> java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-8u212-b01-1~deb9u1-b01)
OpenJDK 64-Bit Server VM (build 25.212-b01, mixed mode)

> mvn -version
Apache Maven 3.3.9
Maven home: /usr/share/maven
Java version: 1.8.0_212, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-8-openjdk-amd64/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.9.0-8-amd64", arch: "amd64", family: "unix"

Ugh, I didn’t realize this was such a small machine – but I guess it not costing me much:

Machine type:

n1-standard-1 (1 vCPU, 3.75 GB memory)

> free -g
              total        used        free      shared  buff/cache   available
Mem:              3           0           3           0           0           3
Swap:             0           0           0

Git clone

I cloned from dremio-oss git repo.

To verify, here is the last Dremio checkin b0b8fd335d8f288f2b6ede710cccdfd40b969e67.

Build

The README.md in the root directory has some good notes on how to build.

To kick off a build I used:

mvn clean install -DskipTests -T 4

Note that this will take quite a bit of time the first time through as maven will be pulling in all dependency packages that are needed for the build. To give you an idea of that load, here is the size of the resulting maven repository directory:

 ~/.m2> cd ~/.m2
 ~/.m2> du -sh .
1.1G    .

I’m impatient so I added -T 4 – even though it probably is not going to make much a difference on this small development box.

Here is the build footer. Again, this is on a subsequent build so all additonal packages have already been loaded.

[INFO] Plugins - S3 Plugin ................................ SUCCESS [ 10.907 s]
[INFO] Plugins - Elasticsearch Plugin ..................... SUCCESS [ 19.672 s]
[INFO] Plugins - Mongo Plugin ............................. SUCCESS [  5.553 s]
[INFO] Plugins - Hive Plugin .............................. SUCCESS [ 24.667 s]
[INFO] Plugins - HBase Plugin ............................. SUCCESS [ 10.760 s]
[INFO] Plugins - JDBC Plugin .............................. SUCCESS [  5.522 s]
[INFO] Analyst Center - daemon ............................ SUCCESS [  5.729 s]
[INFO] Distribution - Parent .............................. SUCCESS [  3.301 s]
[INFO] Distribution - Scripts and Resources ............... SUCCESS [  3.930 s]
[INFO] Distribution - All-in-one JDBC JAR ................. SUCCESS [01:17 min]
[INFO] Distribution - Server Packaging .................... SUCCESS [02:03 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:13 min (Wall Clock)
[INFO] Finished at: 2019-04-15T15:32:19+00:00
[INFO] Final Memory: 362M/864M
[INFO] ------------------------------------------------------------------------

Startup challenges

The README.md describes two approaches to running…

distribution/server/target/dremio-community-{DREMIO_VERSION}/dremio-community-{DREMIO_VERSION}/bin/dremio start

The first command works as expected, however everything is running out of a sub-directory. This makes it difficult to revise, compile & debug with an IDE, such as intellij or eclipse.

So I opted for the alternative recommendation to use this command.

mvn compile exec:exec -pl dac/daemon

Unfortunately, I found that this approach was blocked with a few startup related errors. Starting the server for the first time with this command:

dremio-oss> mvn exec:exec -pl dac/daemon

Resulted in the following exception:

15:59:36.324 [Curator-ServiceCache-0] INFO  c.d.s.c.TaskLeaderStatusListener - New TaskLeader node for MASTER go-devel.us-east1-b.c.gcptraining-221417.internal:45678 registered itself.
15:59:36.979 [main] ERROR ROOT - Dremio is exiting. Failure while starting services.
java.lang.RuntimeException: Failure while attempting to create com.dremio.dac.service.collaboration.CollaborationHelper.
        at com.dremio.service.BinderImpl$InjectableReference.get(BinderImpl.java:427) ~[dremio-common-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.service.BinderImpl.lookup(BinderImpl.java:109) ~[dremio-common-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.service.BinderImpl$DeferredProvider.get(BinderImpl.java:83) ~[dremio-common-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.dac.daemon.SampleDataPopulatorService.start(SampleDataPopulatorService.java:97) ~[dremio-dac-backend-3.1.9-201904051346520183-a35b753.jar:na]

On subsequent startups, I also ran into this error:

17:02:01.452 [main] INFO  c.d.exec.catalog.CatalogServiceImpl - User Error Occurred [ErrorId: fc92a85e-f0b1-4306-8f89-0c9bb0ed8cdf]
com.dremio.common.exceptions.UserException: Unknown source INFORMATION_SCHEMA
        at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:746) ~[dremio-common-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.exec.catalog.CatalogServiceImpl.refreshSource(CatalogServiceImpl.java:332) [dremio-sabot-kernel-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.exec.catalog.CatalogServiceImpl.start(CatalogServiceImpl.java:168) [dremio-sabot-kernel-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.service.SingletonRegistry$AbstractServiceReference.start(SingletonRegistry.java:137) [dremio-common-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.service.ServiceRegistry.start(ServiceRegistry.java:74) [dremio-common-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.service.SingletonRegistry.start(SingletonRegistry.java:33) [dremio-common-3.1.9-201904051346520183-a35b753.jar:3.1.9-201904051346520183-a35b753]
        at com.dremio.dac.daemon.DACDaemon.startServices(DACDaemon.java:184) [dremio-dac-backend-3.1.9-201904051346520183-a35b753.jar:na]
        at com.dremio.dac.daemon.DACDaemon.init(DACDaemon.java:190) [dremio-dac-backend-3.1.9-201904051346520183-a35b753.jar:na]
        at com.dremio.dac.daemon.DremioDaemon.main(DremioDaemon.java:102) [classes/:na]

After some digging, I found that the exec:exec goal is referencing this command in the dac/daemon/pom.xml:

        <executions>
          <!--                                                                                                                                   To run a test server:
		       mvn org.codehaus.mojo:exec-maven-plugin:exec
	      -->
          <execution>
            <id>default-cli</id>
            <configuration>
              <classpathScope>test</classpathScope>
              <executable>java</executable>
              <arguments>
                <argument>-Xms512m</argument>
                <argument>-Xmx4g</argument>
                <argument>-XX:MaxDirectMemorySize=1g</argument>
                <argument>-Ddremio_debug=true</argument>
                <argument>-Djava.io.tmpdir=${project.build.directory}</argument>
                <argument>-Ddac_prepopulate=true</argument>
                <argument>-Ddebug.addDefaultUser=true</argument>
                <argument>-Ddebug.allowTestApis=true</argument>
                <argument>-classpath</argument>
                <classpath/>
                <argument>com.dremio.dac.daemon.DremioDaemon</argument>
              </arguments>
            </configuration>
          </execution>
        </executions>

With some experimentation, I found that the problem is related to the arguement <argument>-Ddac_prepopulate=true</argument>. Apparently dremio is is creating a directory for metadata. Under this execution configuration, the database directory is located in /tmp/dremio.

Fix (for now at least)

Thinking that we’re in database definition case where the metadata database needs to be created on the first pass, then the server can startup in the second pass, I broke the execution command up into two commands – the first to do the dac_prepopulation step and the second skip the dac_prepopulation step. I abandoned the addDefaultUser switch for the time beaing until I could get some positive results.

Here is my replacement executions xml node.

GitLab dac/daemon/pom.xml:

        <executions>
          <!--
		    To run a test server:
			mvn org.codehaus.mojo:exec-maven-plugin:exec
		  -->
          <execution>
            <id>default-cli</id>
            <configuration>
              <classpathScope>test</classpathScope>
              <executable>java</executable>
              <arguments>
                <argument>-Xms512m</argument>
                <argument>-Xmx4g</argument>
                <argument>-XX:MaxDirectMemorySize=1g</argument>
                <argument>-Ddremio_debug=true</argument>
                <argument>-Djava.io.tmpdir=${project.build.directory}</argument>
                <argument>-Ddac_prepopulate=true</argument>
                <argument>-Ddebug.addDefaultUser=false</argument>
                <argument>-Ddebug.allowTestApis=true</argument>
                <argument>-classpath</argument>
                <classpath/>
                <argument>com.dremio.dac.daemon.DremioDaemon</argument>
              </arguments>
            </configuration>
          </execution>
          <execution>
            <id>default-cli-next</id>
            <configuration>
              <classpathScope>test</classpathScope>
              <executable>java</executable>
              <arguments>
                <argument>-Xms512m</argument>
                <argument>-Xmx4g</argument>
                <argument>-XX:MaxDirectMemorySize=1g</argument>
                <argument>-Ddremio_debug=true</argument>
                <argument>-Djava.io.tmpdir=${project.build.directory}</argument>
                <argument>-Ddac_prepopulate=false</argument>
                <argument>-Ddebug.addDefaultUser=false</argument>
                <argument>-Ddebug.allowTestApis=true</argument>
                <argument>-classpath</argument>
                <classpath/>
                <argument>com.dremio.dac.daemon.DremioDaemon</argument>
              </arguments>
            </configuration>
          </execution>
        </executions>

Putting it all together

So to re-set, I delete the database directory /tmp/dremio. then ran the following two commands:

mvn exec:exec@default-cli -pl dac/daemon

This command will end in a ...collaboration.CollaborationHelper stack dump, but it will create the /tmp/dremio directory and presumably define it with a usable schema.

Then run the second command…

mvn exec:exec@default-cli-next -pl dac/daemon

Now that the database has been created, this command succeeds and the server comes up and running.

...
17:24:20.843 [main] INFO  c.dremio.exec.catalog.PluginsManager - Result of storage plugin startup: 
        INFORMATION_SCHEMA: success (2ms). Healthy
        __jobResultsStore: success (338ms). Healthy
        __logs: success (288ms). Healthy
        __home: success (339ms). Healthy
        __support: success (271ms). Healthy
        __datasetDownload: success (331ms). Healthy
        sys: success (0ms). Healthy
        __accelerator: success (343ms). Healthy
        $scratch: success (339ms). Healthy

17:24:20.872 [main] INFO  c.d.service.jobs.LocalJobsService - Starting JobsService
17:24:20.939 [main] INFO  c.d.service.jobs.LocalJobsService - JobsService is up
17:24:21.239 [main] INFO  c.d.p.s.ProvisioningServiceImpl - Starting provisioning service
17:24:21.240 [main] INFO  c.d.exec.server.NodeRegistration - Starting NodeRegistration for go-devel.us-east1-b.c.gcptraining-221417.internal:31010
17:24:21.280 [main] INFO  c.d.exec.server.NodeRegistration - NodeRegistration is up for go-devel.us-east1-b.c.gcptraining-221417.internal:31010
17:24:21.315 [main] INFO  c.d.d.s.search.SearchIndexManager - Search manager created, last wakeup time was 0
17:24:21.446 [Curator-ServiceCache-0] INFO  c.d.s.c.TaskLeaderStatusListener - New TaskLeader node for MASTER go-devel.us-east1-b.c.gcptraining-221417.internal:45678 registered itself.
17:24:21.693 [main] INFO  org.eclipse.jetty.server.Server - jetty-9.2.26.v20180806
17:24:21,737 |-INFO in null - Will use configuration resource [/logback-access.xml]
17:24:21,739 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@f805ab0 - URL [jar:file:/home/brucelowther/.m2/repository/com/dremio/dremio-dac-backend/3.1.9-201904051346520183-a35b753/dremio-dac-backend-3.1.9-201904051346520183-a35b753-tests.jar!/logback-access.xml] is not of type file
17:24:21,749 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
17:24:21,749 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [out]
17:24:21,750 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.access.PatternLayoutEncoder] for [encoder] property
17:24:21,783 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [out] to null
17:24:21,784 |-INFO in ch.qos.logback.access.joran.action.ConfigurationAction - End of configuration.
17:24:21,784 |-INFO in ch.qos.logback.access.joran.JoranConfigurator@3c7a94c7 - Registering current configuration as safe fallback point
17:24:21,737 |-INFO in LogbackRequestLog - Will use configuration resource [/logback-access.xml]
17:24:21,739 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@f805ab0 - URL [jar:file:/home/brucelowther/.m2/repository/com/dremio/dremio-dac-backend/3.1.9-201904051346520183-a35b753/dremio-dac-backend-3.1.9-201904051346520183-a35b753-tests.jar!/logback-access.xml] is not of type file
17:24:21,749 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
17:24:21,749 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [out]
17:24:21,750 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.access.PatternLayoutEncoder] for [encoder] property
17:24:21,783 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [out] to null
17:24:21,784 |-INFO in ch.qos.logback.access.joran.action.ConfigurationAction - End of configuration.
17:24:21,784 |-INFO in ch.qos.logback.access.joran.JoranConfigurator@3c7a94c7 - Registering current configuration as safe fallback point

17:24:23.284 [main] INFO  c.d.d.s.TimingApplicationEventListener - ApplicationEventListener.onEvent INITIALIZATION_START
17:24:23.770 [main] INFO  o.h.validator.internal.util.Version - HV000001: Hibernate Validator 5.2.4.Final
17:24:26.598 [main] INFO  c.d.d.s.TimingApplicationEventListener - ApplicationEventListener.onEvent INITIALIZATION_APP_FINISHED
17:24:26.600 [main] INFO  c.d.d.s.TimingApplicationEventListener - ApplicationEventListener.onEvent INITIALIZATION_FINISHED
17:24:26.930 [main] INFO  o.e.j.server.handler.ContextHandler - Started o.e.j.s.ServletContextHandler@142c4dd6{/,null,AVAILABLE}
17:24:26.945 [main] INFO  o.e.jetty.server.ServerConnector - Started ServerConnector@52d21bb7{HTTP/1.1}{0.0.0.0:9047}
17:24:26.945 [main] INFO  org.eclipse.jetty.server.Server - Started @31132ms
17:24:26.945 [main] INFO  com.dremio.dac.server.WebServer - Started on http://localhost:9047
17:24:26.955 [main] INFO  org.eclipse.jetty.server.Server - jetty-9.2.26.v20180806
17:24:26.957 [main] INFO  o.e.jetty.server.ServerConnector - Started ServerConnector@bf43fd{HTTP/1.1}{127.0.0.1:38215}
17:24:26.957 [main] INFO  org.eclipse.jetty.server.Server - Started @31145ms
17:24:26.957 [main] INFO  c.dremio.dac.server.LivenessService - Started liveness service on port 38215
Dremio Daemon Started as master

Running

Ok, now that the server is up, I need to do some Google cloud network configuration magic to open the port for my ip address and only my ip address…

Then I should be able to connect to the running dremio-oss server.

Yes it appears that the server stdout logs are showing my web client access:

...
Dremio Daemon Started as master
###.###.###.234 - - [15/Apr/2019:17:29:51 +0000] "GET / HTTP/1.1" 200 1339 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
###.###.###.234 - - [15/Apr/2019:17:29:51 +0000] "GET /style.583ea5bcb1404156a493d5295549f5fa.css HTTP/1.1" 200 175625 "http://35.###.###.110:9047/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
###.###.###.234 - - [15/Apr/2019:17:29:51 +0000] "GET /vendor.0b66021ee879f4992591.js HTTP/1.1" 200 415773 "http://###.###.###.110:9047/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
###.###.###.234 - - [15/Apr/2019:17:29:51 +0000] "GET /bundle.0b66021ee879f4992591.js HTTP/1.1" 200 1104625 "http://###.###.###.110:9047/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
###.###.###.234 - - [15/Apr/2019:17:29:51 +0000] "GET /apiv2/socket HTTP/1.1" 403 366 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"

And the web app looks like this at the moment…

dremio-oss up and running

Conclusion

I wouldn’t call this a fix just yet, but a work-around. The real solution would be to have the exec:exec command setup the database if needed, but noop if nothing was needed.

I’m still very early on in learning this server, so I’m sure others will have a better opinion on how to fix this than I.