I was trying to set up Presto with Azure Gen2, but I am facing the issue below when querying

Submitted by: @import:stackexchange-devops

Problem

I am using prestodb:0.234

Dependency jars:

azure-storage-2.0.0.jar
hadoop-azure-3.2.1.jar
hadoop-azure-datalake-3.2.1.jar
azure-data-lake-store-sdk-2.3.8.jar

I get the error below when running a query.

```
com.facebook.presto.spi.PrestoException: cannot create caching file system
at com.facebook.presto.hive.HiveCachingHdfsConfiguration.lambda$getConfiguration$0(HiveCachingHdfsConfiguration.java:70)
at com.facebook.presto.hive.HiveCachingHdfsConfiguration$CachingJobConf.createFileSystem(HiveCachingHdfsConfiguration.java:94)
at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:59)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at com.facebook.presto.hive.HdfsEnvironment.lambda$getFileSystem$0(HdfsEnvironment.java:73)
at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:72)
at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:66)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadPartition(BackgroundHiveSplitLoader.java:298)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:270)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:100)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:199)
at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:47)
at com.facebook.presto.hive.util.ResumableTasks.access$000(ResumableTasks.java:20)
at com.facebook.presto.hive.util.ResumableTasks$1.run(ResumableTasks.java:35)
at com.facebook.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorke
```

Solution

Since ABFS is mentioned in the stack trace as a caching filesystem, I can think of two possible causes and suggest a fix for each.

Dependency mess in Maven

As noted in two answers to a similar question:


When we use maven-assembly-plugin, it merges all our JARs into one, and all META-INF/services/org.apache.hadoop.fs.FileSystem files overwrite each other. Only one of these files remains (the last one that was added). In this case, the FileSystem list from hadoop-common overwrites the list from hadoop-hdfs, so DistributedFileSystem was no longer declared.

So check your dependencies and verify that your distribution contains a JAR with the correct META-INF/services/org.apache.hadoop.fs.FileSystem file.
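One way to do that check is to list what each JAR declares in its service file. The following is only a minimal sketch, not part of Presto or Hadoop: the class name `FileSystemServiceScanner` and the idea of pointing it at a directory of JARs (for example the Hive plugin directory of your Presto install) are assumptions made for illustration. It uses only the JDK.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical helper: prints the FileSystem implementations each JAR declares.
public class FileSystemServiceScanner {
    // Service file that Hadoop's FileSystem reads through java.util.ServiceLoader.
    static final String SERVICE_ENTRY = "META-INF/services/org.apache.hadoop.fs.FileSystem";

    /** Returns the class names listed in the JAR's service file, or an empty list if it has none. */
    static List<String> declaredFileSystems(Path jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            ZipEntry entry = jarFile.getEntry(SERVICE_ENTRY);
            if (entry == null) {
                return names;
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jarFile.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Service files allow '#' comments; strip them and skip blank lines.
                    int hash = line.indexOf('#');
                    if (hash >= 0) {
                        line = line.substring(0, hash);
                    }
                    line = line.trim();
                    if (!line.isEmpty()) {
                        names.add(line);
                    }
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // First argument: directory containing the JARs to inspect (defaults to the current directory).
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                System.out.println(jar.getFileName() + " -> " + declaredFileSystems(jar));
            }
        }
    }
}
```

If no JAR in the directory lists an Azure filesystem class, or a merged JAR lists only the entries from one of its inputs, that would be consistent with the overwrite problem described in the quote.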

PrestoDB uses outdated azure-site.xml

There is a known issue, #11648.
As noted there, support for ABFS, ADLS and ADLS Gen2 was only added to https://github.com/prestosql/presto, not to prestodb.

So my suggestion is to ask your question on the Presto Community Slack.

P.S. Try storediag

There is a Hadoop manual on Azure ABFS.

Among other tips, it recommends:


One useful tool for debugging connectivity is the cloudstore storediag utility.


This validates the classpath, the settings, then tries to work with the filesystem.
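Before reaching for storediag, the classpath part of that check can be approximated by hand. The sketch below is not storediag and does not reproduce what it does. It only tests whether a few classes can be loaded. The class name `AbfsClasspathCheck` is hypothetical, and the candidate class names are assumed to match those shipped in hadoop-common and hadoop-azure 3.2.x. Run it with the same JARs on the classpath that the Presto Hive plugin loads, otherwise the result says nothing about Presto.

```java
// Hypothetical helper: reports whether the Hadoop and ABFS classes can be loaded.
public class AbfsClasspathCheck {
    // Assumption: class names as found in hadoop-common and hadoop-azure 3.2.x.
    static final String[] CANDIDATES = {
        "org.apache.hadoop.fs.FileSystem",
        "org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem",
        "org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem"
    };

    /** Returns true if the class can be loaded without running its static initializers. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, AbfsClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers a class that is present but whose own dependencies are missing.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : CANDIDATES) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

A missing class here points at the dependency JARs. If every class is found, the settings and connectivity checks that storediag performs are the next thing to look at.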

Context

StackExchange DevOps Q#11387, answer score: 2
