Usage and code examples for the com.yahoo.ycsb.generator.ZipfianGenerator class

Reposted by x33g5p2x on 2022-02-05 under Other

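This article collects code examples for the Java class com.yahoo.ycsb.generator.ZipfianGenerator and shows how the class is used. The examples come mainly from GitHub, Stack Overflow, Maven, and similar platforms, and were extracted from selected projects, so they should be useful as a reference. Details of the ZipfianGenerator class:
Package path: com.yahoo.ycsb.generator.ZipfianGenerator
Class name: ZipfianGenerator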

Introduction to ZipfianGenerator

A generator of a Zipfian distribution. It produces a sequence of items in which some items are more popular than others, according to a Zipfian distribution. When you construct an instance of this class, you specify the number of items in the set to draw from, either by specifying an itemcount (so that the sequence is of items from 0 to itemcount-1) or by specifying a min and a max (so that the sequence is of items from min to max inclusive). After you construct the instance, you can change the number of items by calling nextInt(itemcount) or nextLong(itemcount).

Note that the popular items are clustered together: item 0 is the most popular, item 1 the second most popular, and so on (or min is the most popular, min+1 the next most popular, and so on). If you don't want this clustering, and instead want the popular items scattered throughout the item space, use ScrambledZipfianGenerator instead.

Be aware that initializing this generator may take a long time if there are many items to choose from (for example, over a minute for 100 million objects). This is because certain mathematical values must be computed to generate a proper Zipfian skew, and one of those values (zeta) is a sum over the sequence from 1 to n, where n is the itemcount. If you increase the number of items in the set, a new zeta can be computed incrementally, so it should be fast unless you have added millions of items. If you decrease the number of items, however, zeta is recomputed from scratch, which can take a long time.

The algorithm used here is from "Quickly Generating Billion-Record Synthetic Databases", Jim Gray et al., SIGMOD 1994.
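To make the description above concrete, here is a minimal stand-alone sketch of the sampling scheme it describes. It does not use YCSB: the class name ZipfSketch, the seed parameter, and the final clamp to items - 1 are assumptions made for illustration. The formulas for alpha and eta follow the constructor quoted further down this page.

```java
import java.util.Random;

// Hypothetical stand-alone sketch; this is not the YCSB class itself.
class ZipfSketch {
  private final long items;
  private final double theta;
  private final double zetan;
  private final double alpha;
  private final double eta;
  private final Random random;

  ZipfSketch(long items, double theta, long seed) {
    this.items = items;
    this.theta = theta;
    this.zetan = zeta(items, theta);
    this.alpha = 1.0 / (1.0 - theta);
    this.eta = (1 - Math.pow(2.0 / items, 1 - theta)) / (1 - zeta(2, theta) / zetan);
    this.random = new Random(seed);
  }

  // zeta(n, theta) = sum over i = 1..n of 1 / i^theta.
  // This loop is the part that becomes slow for very large n.
  static double zeta(long n, double theta) {
    double sum = 0;
    for (long i = 1; i <= n; i++) {
      sum += 1.0 / Math.pow(i, theta);
    }
    return sum;
  }

  // Returns a value in [0, items - 1]; 0 is the most likely result.
  long next() {
    double u = random.nextDouble();
    double uz = u * zetan;
    if (uz < 1.0) {
      return 0;
    }
    if (uz < 1.0 + Math.pow(0.5, theta)) {
      return 1;
    }
    long ret = (long) (items * Math.pow(eta * u - eta + 1, alpha));
    // Assumption: clamp as a guard against floating-point rounding at the top of the range.
    return Math.min(ret, items - 1);
  }

  public static void main(String[] args) {
    ZipfSketch gen = new ZipfSketch(1000, 0.99, 42L);
    long[] counts = new long[1000];
    for (int i = 0; i < 100_000; i++) {
      counts[(int) gen.next()]++;
    }
    System.out.println("item 0: " + counts[0] + ", item 1: " + counts[1] + ", item 500: " + counts[500]);
  }
}
```

Running main should show item 0 drawn far more often than item 500, which is the clustering of popular items that the description warns about.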

Code examples

Code example source: origin: brianfrankcooper/YCSB

public static void main(String[] args) {
 new ZipfianGenerator(ScrambledZipfianGenerator.ITEM_COUNT);
}

Code example source: origin: brianfrankcooper/YCSB

/**
 * Return the next value, skewed by the Zipfian distribution. The 0th item will be the most popular, followed by
 * the 1st, followed by the 2nd, etc. (Or, if min != 0, the min-th item is the most popular, the min+1th item the
 * next most popular, etc.) If you want the popular items scattered throughout the item space, use
 * ScrambledZipfianGenerator instead.
 */
@Override
public Long nextValue() {
 return nextLong(items);
}

Code example source: origin: brianfrankcooper/YCSB

/**
 * Create a zipfian generator for items between min and max (inclusive) for the specified zipfian constant, using
 * the precomputed value of zeta.
 *
 * @param min The smallest integer to generate in the sequence.
 * @param max The largest integer to generate in the sequence.
 * @param zipfianconstant The zipfian constant to use.
 * @param zetan The precomputed zeta constant.
 */
public ZipfianGenerator(long min, long max, double zipfianconstant, double zetan) {
 items = max - min + 1;
 base = min;
 this.zipfianconstant = zipfianconstant;
 theta = this.zipfianconstant;
 zeta2theta = zeta(2, theta);
 
 alpha = 1.0 / (1.0 - theta);
 this.zetan = zetan;
 countforzeta = items;
 eta = (1 - Math.pow(2.0 / items, 1 - theta)) / (1 - zeta2theta / this.zetan);
 nextValue();
}

Code example source: origin: brianfrankcooper/YCSB

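// Excerpt from the item-count handling, apparently in nextLong(long itemcount).
// Lines omitted by the source page are marked with "// ...".
// ...
   // The item count grew: extend the existing zeta sum incrementally.
   zetan = zeta(countforzeta, itemcount, theta, zetan);
   eta = (1 - Math.pow(2.0 / items, 1 - theta)) / (1 - zeta2theta / zetan);
  } else if ((itemcount < countforzeta) && (allowitemcountdecrease)) {
   // ... (the start of a warning message is omitted here)
     "(itemcount=" + itemcount + " countforzeta=" + countforzeta + ")");
   // The item count shrank: recompute zeta from scratch.
   zetan = zeta(itemcount, theta);
   eta = (1 - Math.pow(2.0 / items, 1 - theta)) / (1 - zeta2theta / zetan);
// ...
setLastValue(ret);
return ret;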

Code example source: origin: brianfrankcooper/YCSB

/**
 * Create a zipfian generator for items between min and max (inclusive) for the specified zipfian constant.
 * @param min The smallest integer to generate in the sequence.
 * @param max The largest integer to generate in the sequence.
 * @param zipfianconstant The zipfian constant to use.
 */
public ZipfianGenerator(long min, long max, double zipfianconstant) {
 this(min, max, zipfianconstant, zetastatic(max - min + 1, zipfianconstant));
}

Code example source: origin: brianfrankcooper/YCSB

/**
 * Return the next long in the sequence.
 */
@Override
public Long nextValue() {
 long ret = gen.nextValue();
 ret = min + Utils.fnvhash64(ret) % itemcount;
 setLastValue(ret);
 return ret;
}
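The snippet above scatters the clustered Zipfian ranks across the key space by hashing each rank and reducing it modulo the item count. The sketch below illustrates that idea on its own. The class and method names are hypothetical, the hash is a generic FNV-1a style loop rather than a copy of YCSB's Utils.fnvhash64, and Math.floorMod is used here so that a negative hash value cannot produce a key below min.

```java
// Hypothetical sketch of hash-based scrambling; not the YCSB implementation.
class ScrambleSketch {
  private static final long FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325L;
  private static final long FNV_PRIME_64 = 1099511628211L;

  // FNV-1a style hash over the eight bytes of val, lowest byte first.
  static long fnvHash64(long val) {
    long hash = FNV_OFFSET_BASIS_64;
    for (int i = 0; i < 8; i++) {
      long octet = val & 0xff;
      val >>>= 8;
      hash ^= octet;
      hash *= FNV_PRIME_64;
    }
    return hash;
  }

  // Maps a clustered rank to a position in [min, min + itemCount - 1].
  // The same rank always maps to the same position, so a hot rank stays a hot key.
  static long scramble(long rank, long min, long itemCount) {
    return min + Math.floorMod(fnvHash64(rank), itemCount);
  }

  public static void main(String[] args) {
    for (long rank = 0; rank < 5; rank++) {
      System.out.println("rank " + rank + " -> key " + scramble(rank, 100, 50));
    }
  }
}
```

Because the mapping is deterministic, the skew of the underlying generator is preserved; only the location of the popular keys changes.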

Code example source: origin: brianfrankcooper/YCSB

/**
 * Compute the zeta constant needed for the distribution. Do this from scratch for a distribution with n items,
 * using the zipfian constant thetaVal. Remember the value of n, so if we change the itemcount, we can recompute zeta.
 *
 * @param n The number of items to compute zeta over.
 * @param thetaVal The zipfian constant.
 */
double zeta(long n, double thetaVal) {
 countforzeta = n;
 return zetastatic(n, thetaVal);
}

Code example source: origin: brianfrankcooper/YCSB

/**
 * Create a zipfian generator for items between min and max (inclusive) for the specified zipfian constant. If you
 * use a zipfian constant other than 0.99, this will take a long time to complete because we need to recompute zeta.
 *
 * @param min             The smallest integer to generate in the sequence.
 * @param max             The largest integer to generate in the sequence.
 * @param zipfianconstant The zipfian constant to use.
 */
public ScrambledZipfianGenerator(long min, long max, double zipfianconstant) {
 this.min = min;
 this.max = max;
 itemcount = this.max - this.min + 1;
 if (zipfianconstant == USED_ZIPFIAN_CONSTANT) {
  gen = new ZipfianGenerator(0, ITEM_COUNT, zipfianconstant, ZETAN);
 } else {
  gen = new ZipfianGenerator(0, ITEM_COUNT, zipfianconstant);
 }
}

Code example source: origin: brianfrankcooper/YCSB

/**
 * Compute the zeta constant needed for the distribution. Do this from scratch for a distribution with n items,
 * using the zipfian constant theta. This is a static version of the function which will not remember n.
 * @param n The number of items to compute zeta over.
 * @param theta The zipfian constant.
 */
static double zetastatic(long n, double theta) {
 return zetastatic(0, n, theta, 0);
}

Code example source: origin: brianfrankcooper/YCSB

/**
 * Generate the next string in the distribution, skewed Zipfian favoring the items most recently returned by
 * the basis generator.
 */
@Override
public Long nextValue() {
 long max = basis.lastValue();
 long next = max - zipfian.nextLong(max);
 setLastValue(next);
 return next;
}

Code example source: origin: ben-manes/caffeine

/**
 * Returns a sequence of events where some items are more popular than others, according to a
 * zipfian distribution.
 *
 * @param items the number of items in the distribution
 * @param constant the skew factor for the distribution
 * @param events the number of events in the distribution
 */
public static LongStream zipfian(int items, double constant, int events) {
 return generate(new ZipfianGenerator(items, constant), events);
}
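The snippet above calls a generate helper that this page does not show. As a rough illustration only, one way such a helper could turn a value source into a bounded stream of events is sketched below. The class name, the LongSupplier parameter, and the body are assumptions, not the caffeine simulator's actual code.

```java
import java.util.function.LongSupplier;
import java.util.stream.LongStream;

// Hypothetical stand-in for the helper that the snippet calls but does not show.
class StreamSketch {
  // Draws exactly `events` values from `source` and exposes them as a LongStream.
  static LongStream generate(LongSupplier source, int events) {
    return LongStream.generate(source).limit(events);
  }

  public static void main(String[] args) {
    long[] counter = {0};
    // A simple counter stands in for a Zipfian generator here.
    generate(() -> counter[0]++, 5).forEach(System.out::println);
  }
}
```

In real use the supplier would wrap a generator's nextValue() call, so the stream carries the skewed key sequence.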

Code example source: origin: brianfrankcooper/YCSB

/**
 * Compute the zeta constant needed for the distribution. Do this incrementally for a distribution that
 * has n items now but used to have st items. Use the zipfian constant thetaVal. Remember the new value of
 * n so that if we change the itemcount, we'll know to recompute zeta.
 *
 * @param st The number of items used to compute the last initialsum
 * @param n The number of items to compute zeta over.
 * @param thetaVal The zipfian constant.
 * @param initialsum The value of zeta we are computing incrementally from.
 */
double zeta(long st, long n, double thetaVal, double initialsum) {
 countforzeta = n;
 return zetastatic(st, n, thetaVal, initialsum);
}
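The body of the four-argument zetastatic is not shown on this page. Assuming zeta is the plain sum of 1 / i^theta described in the class introduction, the incremental computation can be sketched as follows: start from the previous sum and add only the terms for the new items. The class name ZetaSketch is hypothetical.

```java
// Hypothetical sketch of incremental zeta computation; assumes zeta(n) = sum of 1 / i^theta for i = 1..n.
class ZetaSketch {
  // Adds the terms for items st+1 .. n onto a sum that already covers items 1 .. st.
  static double zeta(long st, long n, double theta, double initialSum) {
    double sum = initialSum;
    for (long i = st; i < n; i++) {
      sum += 1.0 / Math.pow(i + 1, theta);
    }
    return sum;
  }

  public static void main(String[] args) {
    double theta = 0.99;
    double first = zeta(0, 1_000, theta, 0.0);       // 1,000 items from scratch
    double grown = zeta(1_000, 1_500, theta, first); // add only the 500 new terms
    double scratch = zeta(0, 1_500, theta, 0.0);     // 1,500 items from scratch
    System.out.println("incremental: " + grown + ", from scratch: " + scratch);
  }
}
```

Growing the item count costs work proportional to the number of added items, while shrinking it has no such shortcut in this scheme, which matches the performance note in the introduction.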

Code example source: origin: brianfrankcooper/YCSB

protected static NumberGenerator getFieldLengthGenerator(Properties p) throws WorkloadException {
 // Re-using CoreWorkload method. 
 NumberGenerator fieldLengthGenerator = CoreWorkload.getFieldLengthGenerator(p);
 String fieldlengthdistribution = p.getProperty(FIELD_LENGTH_DISTRIBUTION_PROPERTY,
   FIELD_LENGTH_DISTRIBUTION_PROPERTY_DEFAULT);
 // Needs special handling for Zipfian distribution for variable Zipf Constant.
 if (fieldlengthdistribution.compareTo("zipfian") == 0) {
  int fieldlength = Integer.parseInt(p.getProperty(FIELD_LENGTH_PROPERTY, FIELD_LENGTH_PROPERTY_DEFAULT));
  double insertsizezipfconstant = Double
    .parseDouble(p.getProperty(INSERT_SIZE_ZIPFIAN_CONSTANT, INSERT_SIZE_ZIPFIAN_CONSTANT_DEAFULT));
  fieldLengthGenerator = new ZipfianGenerator(1, fieldlength, insertsizezipfconstant);
 }
 return fieldLengthGenerator;
}

Code example source: origin: brianfrankcooper/YCSB

public static void main(String[] args) {
 double newzetan = ZipfianGenerator.zetastatic(ITEM_COUNT, ZipfianGenerator.ZIPFIAN_CONSTANT);
 System.out.println("zetan: " + newzetan);
 System.exit(0);
 ScrambledZipfianGenerator gen = new ScrambledZipfianGenerator(10000);
 for (int i = 0; i < 1000000; i++) {
  System.out.println("" + gen.nextValue());
 }
}

Code example source: origin: brianfrankcooper/YCSB

public SkewedLatestGenerator(CounterGenerator basis) {
 this.basis = basis;
 zipfian = new ZipfianGenerator(this.basis.lastValue());
 nextValue();
}

Code example source: origin: brianfrankcooper/YCSB

protected static NumberGenerator getFieldLengthGenerator(Properties p) throws WorkloadException {
 NumberGenerator fieldlengthgenerator;
 String fieldlengthdistribution = p.getProperty(
   FIELD_LENGTH_DISTRIBUTION_PROPERTY, FIELD_LENGTH_DISTRIBUTION_PROPERTY_DEFAULT);
 int fieldlength =
   Integer.parseInt(p.getProperty(FIELD_LENGTH_PROPERTY, FIELD_LENGTH_PROPERTY_DEFAULT));
 int minfieldlength =
   Integer.parseInt(p.getProperty(MIN_FIELD_LENGTH_PROPERTY, MIN_FIELD_LENGTH_PROPERTY_DEFAULT));
 String fieldlengthhistogram = p.getProperty(
   FIELD_LENGTH_HISTOGRAM_FILE_PROPERTY, FIELD_LENGTH_HISTOGRAM_FILE_PROPERTY_DEFAULT);
 if (fieldlengthdistribution.compareTo("constant") == 0) {
  fieldlengthgenerator = new ConstantIntegerGenerator(fieldlength);
 } else if (fieldlengthdistribution.compareTo("uniform") == 0) {
  fieldlengthgenerator = new UniformLongGenerator(minfieldlength, fieldlength);
 } else if (fieldlengthdistribution.compareTo("zipfian") == 0) {
  fieldlengthgenerator = new ZipfianGenerator(minfieldlength, fieldlength);
 } else if (fieldlengthdistribution.compareTo("histogram") == 0) {
  try {
   fieldlengthgenerator = new HistogramGenerator(fieldlengthhistogram);
  } catch (IOException e) {
   throw new WorkloadException(
     "Couldn't read field length histogram file: " + fieldlengthhistogram, e);
  }
 } else {
  throw new WorkloadException(
    "Unknown field length distribution \"" + fieldlengthdistribution + "\"");
 }
 return fieldlengthgenerator;
}

Code example source: origin: brianfrankcooper/YCSB

private static NumberGenerator getKeyChooser(String requestDistrib, int recordCount, double zipfContant,
                       Properties p) throws WorkloadException {
 NumberGenerator keychooser;
 switch (requestDistrib) {
 case "exponential":
  double percentile = Double.parseDouble(p.getProperty(ExponentialGenerator.EXPONENTIAL_PERCENTILE_PROPERTY,
    ExponentialGenerator.EXPONENTIAL_PERCENTILE_DEFAULT));
  double frac = Double.parseDouble(p.getProperty(ExponentialGenerator.EXPONENTIAL_FRAC_PROPERTY,
    ExponentialGenerator.EXPONENTIAL_FRAC_DEFAULT));
  keychooser = new ExponentialGenerator(percentile, recordCount * frac);
  break;
 case "uniform":
  keychooser = new UniformLongGenerator(0, recordCount - 1);
  break;
 case "zipfian":
  keychooser = new ZipfianGenerator(recordCount, zipfContant);
  break;
 case "latest":
  throw new WorkloadException("Latest request distribution is not supported for RestWorkload.");
 case "hotspot":
  double hotsetfraction = Double.parseDouble(p.getProperty(HOTSPOT_DATA_FRACTION, HOTSPOT_DATA_FRACTION_DEFAULT));
  double hotopnfraction = Double.parseDouble(p.getProperty(HOTSPOT_OPN_FRACTION, HOTSPOT_OPN_FRACTION_DEFAULT));
  keychooser = new HotspotIntegerGenerator(0, recordCount - 1, hotsetfraction, hotopnfraction);
  break;
 default:
  throw new WorkloadException("Unknown request distribution \"" + requestDistrib + "\"");
 }
 return keychooser;
}

Code example source: origin: brianfrankcooper/YCSB

// Excerpt; lines omitted by the source page are marked with "// ...".
// ...
 scanlength = new UniformLongGenerator(minscanlength, maxscanlength);
} else if (scanlengthdistrib.compareTo("zipfian") == 0) {
 scanlength = new ZipfianGenerator(minscanlength, maxscanlength);
} else {
 throw new WorkloadException(
// ...

Code example source: origin: brianfrankcooper/YCSB

// Excerpt; lines omitted by the source page are marked with "// ...".
// ...
 scanlength = new UniformLongGenerator(1, maxscanlength);
} else if (scanlengthdistrib.compareTo("zipfian") == 0) {
 scanlength = new ZipfianGenerator(1, maxscanlength);
} else {
 throw new WorkloadException(
// ...

Code example source: origin: com.github.ben-manes.caffeine/simulator

/**
 * Returns a sequence of events where some items are more popular than others, according to a
 * zipfian distribution.
 *
 * @param items the number of items in the distribution
 * @param events the number of events in the distribution
 */
public static LongStream zipfian(int items, int events) {
 return generate(new ZipfianGenerator(items), events);
}
