
Scientific Big Data Open Source Community / Multi-Database Query System - simba

LEEXYZABC committed on 2017-02-28 18:47: rdd support and scalaTests verification

simba

An insertion, extraction, and analysis framework for LDM.

#Notice 1: the Scala version must be compatible with both the system and Spark

  1. Spark 1.3.1
  2. Scala 2.10.4
  3. Hadoop 1.2.1
  4. Titan 1.0.0

#Notice 2: the lib directory in the simba home is assumed to contain the following libs

  1. hadoop-client-1.2.1.jar
  2. hadoop-gremlin-3.0.1-incubating.jar
  3. hbase-common-0.98.2-hadoop1.jar
  4. htrace-core-2.04.jar
  5. hadoop-core-1.2.1.jar
  6. hbase-client-0.98.2-hadoop1.jar
  7. hbase-protocol-0.98.2-hadoop1.jar

Otherwise, you need to include these libs by modifying build.sbt.
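If the jars are not placed in lib (which sbt picks up automatically as unmanaged dependencies), the build.sbt change mentioned in Notice 2 might look roughly like the sketch below. The Maven coordinates are assumptions inferred from the jar file names and are not taken from the project's actual build.sbt; the Scala version comes from Notice 1.

```scala
// build.sbt (sketch; coordinates inferred from the jar names in Notice 2)
scalaVersion := "2.10.4"

// Managed equivalents of the jars otherwise expected in lib/.
// These are plain Java artifacts, so a single % is used (no Scala suffix).
libraryDependencies ++= Seq(
  "org.apache.hadoop"    % "hadoop-client"  % "1.2.1",
  "org.apache.hadoop"    % "hadoop-core"    % "1.2.1",
  "org.apache.tinkerpop" % "hadoop-gremlin" % "3.0.1-incubating",
  "org.apache.hbase"     % "hbase-common"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-client"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-protocol" % "0.98.2-hadoop1",
  "org.htrace"           % "htrace-core"    % "2.04"
)
```

Mixing managed dependencies with copies of the same jars in lib can put duplicate classes on the classpath, so it is probably best to pick one approach.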

#Notice 3: (for titan)

  1. The conf directory contains the configuration file "conf/titan-hbase-es-simba.properties" for TitanDB (HBase + Elasticsearch by default).
  2. The test_input directory contains the docs and links data, which can be loaded as:

         val docRDD = sc.objectFile[Document](...)
         val linkRDD = sc.objectFile[DocumentLink](...)
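For readers unfamiliar with Spark object files, here is a minimal sketch of how such data is written and read back. It assumes the test data was produced with `saveAsObjectFile`; the `Doc` case class and the temporary path are hypothetical stand-ins, not simba's own Document type or the real test_input layout.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for simba's Document class (case classes are serializable).
case class Doc(id: Long, text: String)

object ObjectFileSketch {
  // Writes docs as a Spark object file, reads them back, and returns them sorted by id.
  def roundTrip(sc: SparkContext, docs: Seq[Doc]): Seq[Doc] = {
    // saveAsObjectFile requires a path that does not exist yet.
    val path = Files.createTempDirectory("simba-sketch").resolve("docs").toString
    sc.parallelize(docs).saveAsObjectFile(path)
    sc.objectFile[Doc](path).collect().toSeq.sortBy(_.id)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("object-file-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try roundTrip(sc, Seq(Doc(1L, "first"), Doc(2L, "second"))).foreach(println)
    finally sc.stop()
  }
}
```

The element type passed to `objectFile` has to match the class that was serialized, which is why the type parameter matters when loading the docs and links data.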

compile

sbt clean compile

run

sbt run

test

sbt test

#Simple Example:

    val gDB = TitanSimbaDB(sc, titanConf)
    val docRDD = sc.objectFile[Document](...)
    gDB.insert(docRDD)
    gDB.docs().foreach(s => s.simbaPrint())
    gDB.close()
