Special Sort Of Vector

I do not have a name for a SpecialSortOfVector which I need for some coding. Has anyone else met this?

This vector can be of any length, but all the entries must be positive or zero, and they should all add up to exactly one. This represents the fractions of a split, such as when sharing a cake or dividing a flow. It also crops up in the fractional composition of any mixture.

-- JohnFletcher

How about "distribution"? Or maybe "finite distribution" if you want to be more specific.

Interesting. Another way of looking at it is that all points lie on the positive quadrant of a unit hypersphere. I am attempting to implement a class in CeePlusPlus which will model this. -- JohnFletcher

I would use the polar representation, knowing that the vector length is always one and than the angles are <= pi/2. Then, you should not care about the relation between the components (the vector length must be one by definition).

Polar representation in big dimensions is really icky. It's bad enough in three dimensions with two angles that behave differently. I would keep them as vectors along with a magnitude, and have a normalization procedure applied only when essential. This reduces rounding errors.

Given that it adds up to 1, this can be thought of as a finite probability distribution. When implementing a class for it, will you have lots all of the same length, or will you have lots of varying lengths? The question comes down to - if these are objects, what do they do? Are they mutable? How do they combine? In truth, what are they? If they are proportions in a mixture, it's not really a vector.

I am working on a model of a chemical process. The first place where this object occurs is as the parameters for a plant unit which splits a flow. The parameters are the fraction of the flow which goes into the different outlets. The other place is with chemical composition on either a mass or molar basis. In either case this is often used in the form of mass or mole fractions, which have to add to one. This means that adding some more of any chemical changes all the fractions. The parameters are specific to the unit, but each distinct chemical phase, gas or liquid, will have an object to describe its composition. On the question of length, for the unit that is the number of outlets, and there will be one object at any one time. For the composition it is the number of distinct chemical species in the problem, and they will be an object the same length for all process streams in the same section of the process. -- JohnFletcher

So the first place can be thought of as a probability for a given particle following a given path.

Your two examples actually look like different (though related) cases.

I had one thought about this, but I don't know how useful it will be. If you represent the parts as floating point numbers that add to 1, there is room for error, as they may not always add to exactly 1. Instead, you could represent them as fractions. There is a common denominator across all of the parts, and each part gets its own numerator. The numerators add up to the common denominator, so there is no need to store the denominator. That way, the data can never be in an invalid state. Whether this is easier to work with depends on what sort of operations you need to perform on the data. -- MichaelSparks

See also ObserverPatternInCeePlusPlus for implementation of links between units.


I think the [??] adds up to one, and the divisions hides the fact that you're applying the ConservationOfMass? laws - starting with some kilograms, and following them through the processes. -- ChrisGarrod


Example code

  // ====================================================================
  // APZAOne Class ( All Positive or Zero Add to One )
  // Means to ensure that a vector has all positive or zero entries and adds to one.
  // (APZAOne). This is needed for composition vectors and also for split factors.
  // ====================================================================
  // I have now set this up so that the entries actually entered are the ones stored.
  // [i] will return the raw data.
  // (i) will return the data normalized to (0 - 1).
  // I store the inverse of sum_ which saves some divisions.

template <class T> class APZAOne : public std::vector<T> { typedef std::vector<T> array_type; T sum_; T inverse_; bool checked_; public: APZAOne() : sum_(T(0)), inverse_(T(0)), checked_(false) { } // Needs other constructors. // Need to recalculate if a value changes. T sum() { sum_ = std::accumulate(this->begin(), this->end(),0.0); if (sum_ > T(0)) inverse_ = 1./sum_; return sum_; } // Zeroes any negative values. T check() { if (checked_) return sum_; for (int i = 0; i < this->size(); i++) { if ( (*this)[i] < T(0) ) (*this)[i] = T(0); } checked_ = true; return sum(); } T operator[](int i) const { return array_type::operator[](i) ; } T &operator[](int i) { checked_=false; return array_type::operator[](i) ; } T operator()(int i) { //if (!checked_) check(); violates constness. if (i >= 0 && i < this->size() && sum_ > 0 ) return ((*this)[i])*inverse_; else return T(0); } // Only push back a positive or zero value. T push_back(T t) { if (t >= T(0) ) { array_type::push_back(t); return sum(); } else return T(0); } };

Thanks to all who have made suggestions. Here is some sample code which does it something along the lines which ChrisGarrod suggested. I decided to store the original numbers, as these will be needed. One problem with the code is that it is valid for the outside code to assign using operator[] and doing so will not force a check of the data. The best I can do is set checked_ to false to indicate the possible need.

-- JohnFletcher


MarchZeroSix

CategoryMath CategoryCpp


EditText of this page (last edited April 12, 2007) or FindPage with title or text search