Archive for the ‘Joe Litko’ Category

Heuristics and Optimization

July 12th, 2011 2:20 pm Category: Joe Litko, Optimization, by: Joe Litko

Heuristics and Optimization

 You might think the title should be ‘Heuristics or Optimization’, implying a choice.  But often the two approaches work well together with heuristics speeding an optimization process.  The Wikipedia definition of heuristic calls it an experienced-based technique for problem solving, learning, and discovery.  Wikipedia also mentions using heuristics to find a good enough solution and describes them as ‘strategies using readily accessible, though loosely applicable, information to control problem solving.’

 Those descriptions do not emphasize another aspect of heuristics – there is generally an underlying concept that informs the heuristic.  There is a good reason why we think it will work well in the majority of cases.  For example, an angle sweep heuristic is often used when designing routes for pickup and delivery from a central hub.  Those routes are candidates for selection in a formal optimization.  The designed routes look a lot like the petals of a daisy. 

 The heuristic starts out by heading north and picking locations close to that direction on the way out and back.  How far out the route goes is a property of vehicle capacity or time limitations.  The next route to be generated starts out slightly east of north and follows the same limitations and usually overlaps many of the locations on the first route.  Once the entire compass has been swept, the best set of routes to cover all locations is selected by an optimization.  In the example the heuristic becomes a front end for the optimization. 

 Another example comes from a driver scheduling problem.  Suppose a set of drivers must pick up some commodity from a set of locations for processing at a central plant.  Each trip in this example is an out-and-back because of the nature of the commodity, i.e. only one location can be visited.  Drivers pick up multiple loads in a day, and each location requires multiple visits.  The pickup times are fixed because of other problem features.  One approach is to simply allow all combination of driver-load-location pairings and let an optimizer grind away. 

 But there are other desirable features of the solution:  equalizing number of loads among drivers, and keeping driver dead time between loads to a minimum.  Specifying all the driver loads by some simple heuristic, e.g. send a driver out for the next load as soon as possible, usually ends up with some loads that cannot be covered.  A totally greedy approach fails.

 An approach that seems to work well in this case is to consider some drivers for early loads and some for late loads.  Work from the front of the early loads assigning each of the early drivers the first two loads they can feasibly complete.  Then work from the end of the late loads assigning the last two loads of the late drivers as the last two loads they can feasibly complete. 

 The loads in the middle and the drivers that are not considered early or late are handled by the optimization.  Notice that the heuristic does well on the driver gaps and guarantees that most drivers automatically get two loads, which is a good base in this application.  It also serves to speed the optimization by reducing the pairings to be searched while preserving enough flexibility to get a solution. 

 Furthermore, the heuristic is flexible in that one can choose how many drivers to consider early or late and how many of their loads to nail down heuristically.  Most importantly the heuristic gets better solutions than the optimization finds in any reasonable time.  So while the optimization must have that best solution out there, it will not find it in the time frame the scheduler has to work within.

 My experience is that flexibility is one of the key properties in any good heuristic.  Adaptability to new situations is also a feature of good heuristics.  One final example illustrating adaptability is based on an algorithm called ‘Toyoda’s Algorithm’.  I have applied this particular idea to a number of situations.

 In this example it is applied to sequencing the unloading of containers at a port.  Each container holds a selection of parts which have to be processed by various work centers prior to shipment out of the port.  A manifest shows the selection of the parts and it is known how much work is associated with a given part at a work center.  Not every work center can process every part type.  The objective is to get all the work centers to end their day at the same time and to keep all the work centers busy throughout the day.

 The approach is easy to understand in two dimensions.  The X and Y axis represent the available work time of two work centers, e.g. eight hours.  The arrows represent the amount of work delivered to a work center by a given container.  The dashed line is the ‘ideal path’ — equal amount of work at each work center throughout the day.

  

 The heuristic simply needs to loop through all available containers at each iteration and always try to get back onto the ideal path.  Penalties for deviation are totally flexible in that small deviations can be without penalty while sizable ones are some function that penalizes them heavily.  Other problem features can be captured, e.g., buffer space at a work center, and incorporated in the penalty function.  This is not a formal optimization, but it is speedy and good enough for the real world application.

 The watchwords seem to be these.  Look for the important features of a good solution.  See if a simple rule or concept will drive the solution toward these good features.  This is especially true when there is little or no economic benefit to the optimal solution.  Try to develop a heuristic that is flexible in adapting to normal variance in instances of the data and can be tuned to choose between competing objectives.

Cluster Analysis in Supply Chain Optimization

November 2nd, 2010 8:28 am Category: Joe Litko, Optimization, by: Joe Litko

Cluster Analysis in Supply Chain Optimization

Cluster Analysis (CA) is a versatile tool for grouping objects based on their similarity to each other.  It is applied in a wide variety of situations.  Some example applications are similarity of faces, countries, and species of dogs.  One starts by defining attributes of the individuals (countries, faces, species…) that can be assigned a numerical value, e.g. length of nose, separation of eyes.  One of several available metrics is used to define the ‘distance’ between the individuals.   Individuals are then assigned to a cluster based on the shortest distance.  CA produces some appealing results and gives useful insights that spark further investigation even in the exotic applications.

In the relatively straightforward context of grouping locations, cluster analysis performs equally well.  This makes CA a useful tool in supply chain optimization.  Designing a distribution network often involves planning of routes over regions or deciding on locations for warehouses.  One could look at a map covered with dots representing locations and draw circles around groups.  But this would be an arbitrary approach.  CA offers a way to group locations in a systematic way and speeds up the process of exploring several different versions of the clusters.  I am going to present a few examples and describe the mechanics of the CA process.

In the examples that follow we have a set of 400+ locations from an area surrounding Denver Colorado.  The latitude and longitude of the locations are the attributes that will define the similarity of locations.  In this case the distance between objects and clusters really is a distance.  In CA you can generally use Euclidean distance or XY distance – as though you were on a sidewalk in a city.  Which to use depends on the situation.  If looking at a downtown area, the XY distance might be best.  But over a broader area the Euclidean or Great Circle distance makes more sense.

There are really two broad categories of clustering – hierarchical and non-hierarchical.  The hierarchical methods begin by considering all the individuals (locations) as separate groups.  The nearest two are joined and now we have one less group.  The location of that group can be defined in a few different ways.  For now picture it as the centroid of the group members – which could be weighted by something like demand for a product.  The next join depends on the remaining distances between members or between a member and the existing clusters.  In either case we end up with one less group.  CA keeps joining until there is one large group.  This is agglomerative or hierarchical clustering.

Along the way there is a point at which we have 434, 433… 5, 4, 3, 2, 1 groups.  For any given number of groups we can look at the distance between the groups and the distance of group members from the centroid of their own group.  One could look for a small number of clusters where the distances within a cluster are small compared to the distances between clusters. 

Most of the options within hierarchical clustering are based on the way we define the distance between an individual location and a cluster and the distance from one cluster to another.  One example is nearest neighbor.  In this case the distance of an individual to a cluster is the distance to the closest existing location in the cluster.  Farthest neighbor is also a possibility.  Neither of these is well-suited to most supply chain applications.  The typical method is to use the centroid as the cluster location for defining distances.  Let’s take a look at three methods, average distance, farthest neighbor, and Ward’s method (which will be described soon).

Figure 1 – Clustering based on farthest neighbor (Complete Clustering)

In the complete method an individual or cluster is joined to another cluster based on the smallest maximum distance between any two individual locations.  Figure 1 above is what it produces.  If the groups were to be used to define clusters for delivery routes one problem might be the unequal sizes of the five groups that were created.

The average distance method joins individuals or clusters to another cluster based on the smallest average distance to the members of a cluster.  This gives another result with unequal cluster membership.  A method based on the median distance gives a result very similar to the complete method using the sample data, though it is less sensitive to outlying locations that are far from all others.

If reasonably equal group sizes are important, then another method called Ward’s method is one of the best choices.  The results using Ward’s method are shown below.  Ward’s method makes assignments that minimize the within clusters deviations from the centroid of the cluster.  What you get depends on how many clusters you ask for.  You can compare the results of five and six clusters.

Figure 2 – Clustering based on average distance to the existing cluster members.

 

Figure 3 – Ward’s Method with five clusters requested.

 

Figure 4 – Ward’s Method with six clusters requested.

Another way to look at the results is with a type of graph called a dendogram.  The dendogram gives a visual look at the way similarity of clusters changes as individuals are merged into the clusters.  The dendogram is shown in Figure 5 below.  The vertical scale is just a measure of similarity.  It is linear and is often just a normalized scale from zero to one hundred. 

What you take away from the dendogram is this:  as fewer clusters are formed, the similarity of members within a cluster decreases.  Picture a horizontal line that cuts across the figure where it will intersect just six of the vertical dendogram lines.  Notice that with just a small change in the vertical scale we could cross four or five vertical lines of the dendogram.  That implies there is not much change going from four to six clusters in terms of the nearness of individuals within the clusters.  They are still very similar (close).  Things change much more rapidly going below four clusters.  Of course, at the bottom when there are two hundred clusters, members of each cluster are very similar (read close) to each other.

The last method to look at is different than hierarchical clustering.  It is called K Means clustering.  It is not a hierarchical method so a dendogram is not really possible – joining does not occur sequentially.  Without too many details, K Means clustering works like this.  We start with an initial division into a certain number of clusters – let’s say six.  The initial division should be done in some sensible way, e.g. Ward’s method.

Figure 5 – Dendogram based on Ward’s method.

K-Means clustering keeps examining all the observations to see if they are closer to the centroid of some group than to the group they are currently in.  If so, they are moved and the centroids of both affected groups are recalculated.  The method continues until no more improving moves can be made.  It is important to start with a decent initial grouping.  The result of K-Means clustering using Ward’s method as a starting point is shown in Figure 6.

Figure 6 – K Means Clustering with an initial group.

If we had started without an initial group the result would be a bit different.  It would look like Figure 7.

Figure 7 – K Means Clustering without an initial group.

This is somewhat different than the result with an initial group. 

Bottom line recommendations are these.  Use K-Means clustering with Ward’s method as an initial seed.  If K-Means clustering is not available, use Ward’s method.  If you use a hierarchical method take a look at the dendogram as a way to decide on a number of clusters.  You can reduce the number of clusters until the reduction of similarity is large.  Looking at the scattergram of the points for various numbers of clusters should confirm the dendogram information.  Methods like the farthest or nearest neighbor will very likely give poor results.  All methods are affected by outliers to some degree.  Consider managing a few far out points by hand.  Minitab is a very good software product for doing cluster analysis.

CA is far from a mathematical curiosity.  It can be a very useful tool for network or facility location analysis.  It certainly lets the analyst explore more solutions than could be done manually.  Rather than some arbitrary decisions on grouping, CA can contribute the analysis that leads to bottom line savings.

Focusing Clients on Solution Features

April 29th, 2010 11:54 pm Category: Joe Litko, Scheduling, by: Joe Litko

When using a rapid iterative development process, a client may see several solutions to his problem that are not perfect.  Getting the best feedback from the client at every cycle of the development is important.  If you get only the most obvious evaluations, the progress towards an acceptable product is erratic, possibly unsuccessful, and very painful.

Recent experience was a scheduling application where we did not have direct contact with the final user.  Here we were working for an intermediate third party.  It finally became clear that feedback was going to be limited to a comment about the feature the customer wanted to focus on.  Usually that would represent some aspect of the solution that was either unacceptable or very different from current practice.  Some current practices would require new resources and changes in parts of the system that could not be easily changed. Read the rest of this entry »

When I worked as a Branch Chief in Air Mobility Command’s Analysis Group my boss often talked about nailing the Jell-O to the wall. I took that to mean we had a project with some ill-defined performance measures and objectives. Our first task was going to be establishing the goals. He was a crusty old colonel who was usually dead right on these matters.

The phrase stuck with me, but its meaning has probably evolved a bit. Working with our clients we frequently create optimization models where the objective is a mixture of terms – some that are concrete and some that are ill-defined. The user has certain costs that are very real but also recognizes that a good solution has other desirable qualities. Sometimes these additional criteria are easily expressed as constraints. But just as often they have to be weighed against the concrete and known cost terms in the objective function.

As an example, we have a client that must pay to move heavy equipment around the country from one customer site to another. The model output is a schedule of the required movements for each piece of machinery. Any given solution will have values for a few performance measures. The cost per mile of moving the equipment is easily measured and is known. Although this cost might vary during the year and is subject to some uncertainty, it is still well-defined. We can get solutions for different values of this cost parameter when we do sensitivity analysis. This is the most directly and easily quantified performance measure for a solution.

On the other hand, there are some other factors that enter into the objectives and are harder to value. For instance, risk is a factor we add to the model to give some latitude in meeting deadlines for the arrival/departure of the equipment to/from locations where it will service a user contract. Clearly, we want a solution with little risk. Upgrades are another such factor. At times, equipment of higher quality than contracted for must be used to meet contract dates. Again, this is something to be avoided since there is wear and tear on the expensive machinery. Finally, there can be occasions when an appropriate configuration of equipment is just not available. Meeting the contract requires leasing equipment at substantial cost.

None of the additional factors is well-expressed in a constraint since there are no absolute limits. The model we use influences these performance measures via the cost terms in the objective function we are trying to minimize. The trick is to assign the right costs to get solutions that the user can recognize as good. In cases like these, Experimental Design applied to the optimization model is an efficient way to derive a useful set of parameters.

Experimental Design (DOE)
Design of Experiments (DOE) has long been used in industrial settings to quantify the way parameters affect product design. Taguchi methods and Robust Engineering are the most popular names for this approach. Inferior methods based on intuition or one-factor-at-a-time experimentation have been thoroughly discredited and largely driven out of practice by DOE approaches.

Applying DOE to optimization and simulation modeling is not a recent development, either. In the machinery example, the solution (a schedule with routes) is essentially a product we are designing with the help of the model. In the typical application DOE is used to summarize the behavior of the model with respect to resource allocation. Essentially, we see how much benefit is derived from the addition of resources. In this case we use DOE to see how the shape of the solution changes as costs are varied. The costs represent the relative importance of the various aspects of the solution. Some are concrete others really are not.

Returning to the machinery movement example you can probably picture what happens with some extreme values of the costs. If substitution and upgrades are free, then you get a solution with a small number of miles. But you find customers who paid for low-tech equipment getting a lot of free upgrades. This is not desirable for other reasons besides wear-and-tear. Those customers that paid for the best can feel abused and it hardly motivates the lucky customer to pay full price for the best equipment next time around.

Also, an overly conservative approach to meeting contract deadlines means that the company must buy more equipment than it really needs. So a very high penalty for taking risk, e.g. days allowed to reposition equipment, can be very expensive. The start and end dates for the contracts shift during the course of the year. A little risk is not entirely a bad thing — especially since the allowed risk is easily controlled in the model.

Certainly what we really want to avoid is a model that recommends inferior solutions. In a two-dimensional example that would be a solution that had the same risk value but more miles associated with movement than are needed.

Numerical Examples
Realize that as time moves on the equipment example is a model that must be rerun repeatedly. New commitments are made and existing ones may have been modified. Those solutions that have attractive summary performance measures are the ones that should be examined in detail. Ones with inferior solutions can be safely ignored. In this application the cost settings seem pretty stable – produce good solutions. But there is nothing to prevent rerunning the DOE periodically to examine solutions based on their summary performance measures.

The table below (click to enlarge) shows a set of results from a point early in the year for a subset of the equipment to be scheduled for the coming year. All of the values have been changed, but patterns have been preserved for illustration purposes. In this example we chose just three of the possible six (shown in yellow) parameters that could be varied. We were able to run all eight possible combinations of the values shown. We could investigate a larger set of parameters, perhaps by using fractional experiments that look for important effects without doing all the possible runs.


Without actually estimating the effects of the parameters you can see at a glance most of the effects. High upgrade costs lead to fewer upgrades, high risk costs lead to smaller number of risk days. High costs per route lead to a set of routes with larger total miles. One could argue that run one or run three yields a schedule which is a good tradeoff among the performance measures we considered here.

So a user would be encouraged to take a look at the details of the solutions (scheduling) from Run 1 or Run 3. Finally, notice that there is an interaction between RouteParm and RiskParm on the RiskDays performance measure. That is, changing the RiskParm value has a different effect on Risk Days depending on the setting of RouteParm – much more influence when RouteParm is large than when RouteParm is small. Another benefit of DOE is the ability to spot those kinds of interactions. One-at-a-time experimentation is completely incapable of finding this sort of information.

"http://profitpt.s3.amazonaws.com/wp-content/uploads/old/chart1-799962.png">In the heavy equipment case we had some ideas on the neighborhood to be investigated for the parameter settings. That is not always the case. Sometimes the parameters are very difficult to calibrate by reasoning from or comparison with costs that are known. For instance, we have an example from agriculture. Here the model suggests the order in which a group of farms will be harvested. Processing plants want the products in specific weight ranges and require amounts of each product for each day over the planning horizon.

Besides getting within the desired weight range there are some other goals. For instance, it would be good to hit the weight targets – not simply be within the range. We also want to reduce the number of trucks that have to be dispatched and the number of miles driven. We also want to visit individual farms a limited number of times. Furthermore, if product is not harvested while in an allowed range, it is ‘wasted’ and this is one of the most costly penalties – almost an absolute requirement. Realize that as time goes on the product is always growing and so there is a limited time window on the weight range.

Not all of a given the product on a farm is of the same weight. It varies from section to another based on the time it started growing. So in a simplest of experiments you can test the tradeoffs between distance traveled and missing the weight targets. If you are willing to visit several times, you can come closer to hitting the target weight. But this will involve more visits and usually more total mileage.

Predictions
If one goes the extra step of fitting a statistical model to the DOE results, then predictions can be made for the performance measures. When multiple performance measures are involved, one can locate regions where all performance measures are in acceptable ranges. Various statistical packages, e.g. JMP or Minitab, provide the ability to create two-dimensional contour plots that show regions where both dimensions are within ranges defined by the user.



The three figures above are examples of contour plots. Given that the experiment only examined two levels of each factor, none of the plots can really show curvature. But you can see an interaction in the Miles graph.

Conclusion
Not every model requires an experimental design. But in cases where there are multiple performance measures that need to be combined using (possibly) arbitrary weights, DOE is an excellent approach. This is true for simulation and optimization models alike. Furthermore, DOE can be used as a predictive tool. If the design is carefully chosen, it can guide you to a useful operating region and reveal interactions between the various factors.

So how do you nail Jell-O to the wall? You throw a Design of Experiments net over it.

This article was written by Joe Litko, Profit Point’s Business Optimization Practice Leader.

To learn more about our business optimization services, contact us here or call (866) 347-1130.

Profit Point has recently added another tool for analyzing and managing businesses and supply chains – system dynamics modeling. System dynamics focuses on feedback effects in complex systems. That is what distinguishes it from other simulation and modeling techniques. Feedback means that although X influences Y, it is also true that Y influences X even if this influence is mediated by a string of causal relations.

Here is a typical System Dynamics drawing from Wikipedia. In this diagram there is a flow of individuals from a category called potential adopters into a category called adopters. The rate of adoption is controlled by the ‘valve’ called ‘new adopters’. Many simulation modeling approaches would treat this as a ‘one-way street’ and attempt to model the system by controlling the valve with some external data in the hope of reproducing behaviors observed in the real system.

The point of System Dynamics is to explicitly include the feedback which causes the observed behavior. The model above could be a portion of a model of product adoption for the pharmaceutical industry. Another classic example is a predator – prey population model. When prey are plentiful, the predator population increases, which reduces the prey, which in turn reduces the predators. The extension of this classic model to a system of a business and its resources is fairly natural.

The System Dynamics approach has also been widely applied in typical supply chain scenarios. The effects of feedback can be studied in the context of inventory management and shipping policies. The figure below is of a classic System Dynamics approach to modeling the supply chain from raw materials production through shipment of the finished goods. Feedback is represented by the directed arrows from upstream processes back to the downstream processes. In many cases these represent control mechanisms meant to keep the system in balance or within limits. Successful feedback usually dampens oscillatory behavior – stock-outs, long supply, or obsolete inventory items, for instance.

System dynamics models are classified as continuous simulation models and are typically built in languages specifically designed for these types of models. They are quite different from the usual simulation models that are based on discrete entities and systems that jump from event to event in time. System dynamics models are based on equations that describe (usually) continuous flows of material, people, money, influence, etc. The models are highly useful as policy analysis tools – revealing the often unintended consequences of business rules. The models can examine periods of years or decades, while including external effects, e.g. the economy, and all the significant interactions within a business and with the outside world.

We look forward to bringing this exciting, powerful, and well-established analysis tool to bear on problems and opportunities of our clients.

This article was written by Joe Litko, Profit Point’s Business Optimization Practice Leader.

To learn more about our Business Optimization services, contact us.

Leveraging Profit Point’s supply chain optimization methodologies, Toyota North American Part Center California improves efficiency and quality of their workload planning sequencing process to receive containers from Japan.

North Brookfield, MA (PRWEB) October 6, 2008

Profit Point today announced that Toyota Motor Sales (TMS), U.S.A., Inc.’s North American Part Center California (NAPCC) has improved its receiving sequencing processes using advanced mathematical optimization techniques. NAPCC is one of the parts distribution centers among TMS’ North American Parts Operations network, which was established to improve local parts sourcing and manage a parts distribution network that supplies all North American Toyota distributors, U.S. Toyota, Lexus and Scion dealers as well as export to parts centers in Japan. NAPCC turned to Profit Point to apply mathematical optimization techniques to further improve their supply chain operations.

“We turned to Profit Point to apply mathematical optimization techniques to further improve our supply chain operations,” Johnnie Garlington, NAPCCs warehouse operations manager. The program supported the increase in daily offload by 16% resulting in labor savings, off-site storage costs and detention expenses.

Profit Point, the leading supply chain optimization company, combines proprietary software with proven optimization techniques to help business managers improve their operations. Profit Point supported NAPCC’s objective to redesign their workload planning process to improve the efficiency and quality of their sequencing processes. Profit Point carried this out by designing and building custom supply chain software to optimize their sequencing processes.

“We were asked to investigate a mathematical approach to solving Toyota NAPCC’s container receiving sequencing process,” said Joe Litko, Profit Point’s Business Optimization Practice Leader. “This was an interesting challenge for several reasons. We needed a cost-effective solution using legacy tools, the model needed to run quickly, be flexible, and give robust solutions that consider several performance measures simultaneously.”

NAPCC had been using a traditional spreadsheet to manually achieve an hourly workload plan. Profit Point reviewed the sequencing process and designed a stand-alone application to smooth out the flow of containers to maximize the daily unload capacity.

“Like most businesses, Toyota NAPCC was using good, traditional operations practices,” said Dr. Alan Kosansky, Profit Point’s President. “But, by combining the right mathematical optimization methods with a clear understanding of the business requirements, we were able to achieve a superior supply chain process for Toyota.”

To learn more about Profit Point’s supply chain optimization software and services, visit www.profitpt.com.

About Profit Point:

Profit Point Inc. was founded in 1995 and is now a global leader in supply chain optimization. The company’s team of supply chain consultants includes industry leaders in the fields infrastructure planning, green operations, supply chain planning, distribution, scheduling, transportation, warehouse improvement and business optimization. Profit Point’s has combined software and service solutions that have been successfully applied across a breadth of industries and by a diverse set of companies, including The Coca-Cola Company, General Electric, Rohm and Haas and Toyota.

Contact:
Richard Guy
Profit Point
(866) 347-1130
http://www.profitpt.com

Ready to optimize?

Call Profit Point at 866. 347. 1130

Contact Us

Meet our Team

Our Clients

Sign up for free newsletter

sign-form

Published articles

  • img4

Recent Posts