Note: this page is taken from a popular StackOverFlow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/40225485/

Time: 2020-11-03 05:00:48  Source: igfitidea

How to convert column values from string to decimal?

Tags: java, apache-spark, apache-spark-sql

Asked by Igor Kustov

I have a dataframe which contains a really big integer value, for example:

42306810747081022358

When I tried to convert it to long, it worked in Java but not in the Spark environment; I got:

   NumberFormatException: For input string("42306810747081022358")

Then I tried to convert it to a Decimal (BigDecimal) value. Again, this is easy to do in Java, but in Spark: dframe.withColumn("c_number",col("c_a").cast(new DecimalType()));

This way I don't get any exceptions, however I can see that all result values are null.

I also tried to use a UDF for this purpose but got the same results:

UDF1<String, BigDecimal> cTransformer = new UDF1<String, BigDecimal>() {
    @Override
    public BigDecimal call(String aString) throws Exception {
        return new BigDecimal(aString);
    }
};
sqlContext.udf().register("cTransformer", cTransformer, new DecimalType());
dframe = dframe.withColumn("c_number", callUDF("cTransformer", dframe.col("c_a")));

And here again all I'm getting is a column with all zeroes.

How should I proceed?

Answered by

Try:

dframe.withColumn("c_number", dframe.col("c_a").cast("decimal(38,0)"))

Answered by Fabich

A Decimal has a precision and a scale value; by default the precision is 10 and the scale is 0.
The precision is the maximum number of digits in your number. In your case you have more than 10 digits, so the number can't be cast to a 10-digit Decimal and you get null values.
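
To make the digit count concrete, here is a minimal plain-Java sketch (no Spark involved) that counts the digits of the value from the question with `java.math.BigDecimal`. The class name `PrecisionCheck` and the helper `digitCount` are made up for illustration only.

```java
import java.math.BigDecimal;

// Hypothetical helper class, used only to illustrate the digit count.
class PrecisionCheck {

    // Number of significant digits in the decimal text.
    static int digitCount(String text) {
        return new BigDecimal(text).precision();
    }

    public static void main(String[] args) {
        int digits = digitCount("42306810747081022358");
        System.out.println("digits = " + digits);                   // digits = 20
        System.out.println("fits in 10 digits: " + (digits <= 10)); // false
        System.out.println("fits in 38 digits: " + (digits <= 38)); // true
    }
}
```

With 20 digits, the value cannot fit a precision of 10, which matches the null values seen with the default `DecimalType()`.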

To avoid that, you need to specify a precision large enough to represent your numbers:

dframe.withColumn("c_number", dframe.col("c_a").cast(new DecimalType(38,0)))

Note that the precision can be at most 38.
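
Before choosing a precision and scale, it can help to test candidate pairs against sample values. The sketch below is plain Java and only approximates the rule described above: round the value to the target scale, then compare its digit count with the precision. The class `DecimalFit`, the method `fits`, and the choice of half-up rounding are assumptions for illustration; this is not Spark's own code.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, not part of Spark: approximates whether a value
// can be represented with a given decimal precision and scale.
class DecimalFit {

    // Upper bound on precision, as stated in the answer above.
    static final int MAX_PRECISION = 38;

    static boolean fits(String text, int precision, int scale) {
        if (precision < 1 || precision > MAX_PRECISION || scale < 0 || scale > precision) {
            throw new IllegalArgumentException("invalid precision/scale: " + precision + "," + scale);
        }
        // Assumption: round to the target scale first, then count all digits.
        BigDecimal rounded = new BigDecimal(text).setScale(scale, RoundingMode.HALF_UP);
        return rounded.precision() <= precision;
    }

    public static void main(String[] args) {
        String value = "42306810747081022358";
        System.out.println(fits(value, 10, 0)); // false: 20 digits > 10
        System.out.println(fits(value, 38, 0)); // true
        System.out.println(fits(value, 9, 2));  // false: 22 digits after adding 2 decimals
    }
}
```

Under this check, a pair such as (9,2) is too small for the value in the question, while (38,0) is large enough.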

Answered by Sesha

Once the data is loaded into a data frame and the column that needs to be converted is ready, try: dframe.select($"column_name".cast("decimal(9,2)"))

Answered by Protyush Ghosh

In Scala:

df=df.withColumn("col", $"col".cast(DecimalType(9,2)))