Wikipedia:How to create charts for Wikipedia articles

From Wikipedia, the free encyclopedia

PNG version of a graph
SVG version of a graph meant to look similar to the PNG

Graphs, charts, and other pictures can contribute substantially to an article. Here are some hints on how to create a graph. The source code for each of the example images on this page can be accessed by clicking the image to go to the image description page.

Guidelines

These should be followed whenever possible.

  1. Use the SVG format.
  2. Upload to Wikimedia Commons using the Commons Upload Wizard.
  3. Be sure to include a licensing tag (GFDL, CC, public domain, etc.).
  4. Don't use any natural language in the plot - numbers and symbols only.
  5. Place descriptive text in the caption. If needed you can also write extended information in the image description page.
  6. Use only the fonts supported by MediaWiki (listed here). Don't convert the text into paths. Use Unicode characters. You can find the complete Greek alphabet on Commons. Check character display after uploading.
  7. Check colorblind display of your chart with Vischeck. See Wikipedia:Manual of Style § Color coding. Use dashed or dotted lines or differently-shaped symbols to identify different objects, in addition to color. Blue can be distinguished from other colors by most color-deficient people. Avoid shadows or cross-hatching.
  8. Include a self-contained script by which you created the plot on the image description page. Ideally someone else can copy and paste and reproduce the same result with minimal effort. Commenting your code is also helpful to make your code more understandable.
  9. Avoid legends - place labels and data explanations directly on the graphic itself.

See also Help:Pictures on how to include them in articles.
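Several of these guidelines can be seen together in a very small hand-written SVG. The following is only a sketch using Python's standard library. The function name `build_chart`, the coordinates, and the choice of "DejaVu Sans" as a font are assumptions made for illustration; check the font against the list of fonts MediaWiki supports. It keeps labels as real text (not paths), uses Unicode symbols instead of words, tells the series apart by dash pattern as well as position, and labels each line directly instead of using a legend.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_chart(series):
    """Return SVG markup for a list of (label, dash_pattern, points) tuples.

    label        -- short symbol placed next to the end of the line
    dash_pattern -- value for stroke-dasharray, or "" for a solid line
    points       -- list of (x, y) pairs in SVG user units
    """
    ET.register_namespace("", SVG_NS)  # emit xmlns="..." on the root element
    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"width": "200", "height": "120", "viewBox": "0 0 200 120"},
    )
    for label, dash_pattern, points in series:
        attrs = {
            "points": " ".join(f"{x},{y}" for x, y in points),
            "fill": "none",
            "stroke": "black",
            "stroke-width": "2",
        }
        if dash_pattern:
            attrs["stroke-dasharray"] = dash_pattern
        ET.SubElement(svg, f"{{{SVG_NS}}}polyline", attrs)

        # Direct label at the end of the line, kept as a <text> element.
        last_x, last_y = points[-1]
        text = ET.SubElement(
            svg,
            f"{{{SVG_NS}}}text",
            {
                "x": str(last_x + 4),
                "y": str(last_y + 4),
                "font-family": "DejaVu Sans",  # assumed to be available
                "font-size": "10",
            },
        )
        text.text = label
    return ET.tostring(svg, encoding="unicode")


chart = build_chart([
    ("α", "", [(10, 100), (80, 60), (150, 20)]),
    ("β", "6 3", [(10, 30), (80, 50), (150, 90)]),
])
with open("guideline_demo.svg", "w", encoding="utf-8") as handle:
    handle.write(chart)
```

Descriptive wording would then go in the caption or on the image description page, alongside this script.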

Exceptions

No SVG output
If you can't use SVG, you can create the plot in a bitmap format but make it very large, for instance 6000×4500 pixels. Increase the proportions as well, for example font size 48 and a line thickness of 17 pixels. Then use software like Photoshop or GIMP to Gaussian blur it at 2 pixels. Finally reduce it down to about 1000 pixels on a side (e.g. 1300×975) using bicubic interpolation. This gives a plot with no jagged lines that is also big enough so that someone could download it and use it for projection purposes without apparent pixellation. Save the image as PNG.
Non-free license
In rare situations a non-free image may be acceptable. Such files cannot go to Commons; upload them locally with the Wikipedia:File upload wizard instead.
Plot requires natural language
The main reason to avoid natural language is so that the plots may be used in any language version of Wikipedia. It is fine if the plot requires it, but consider naming it "Plot en.svg" to encourage translation. If you do need to use text, keep it clear, precise, and modest. Spell out words. Write them in your standard language direction and sentence case, with serifs. Use Unicode. Here are some additional recommendations for text legibility.
Symbol font or other specialized font
The most common problems are with characters in the Symbol font, which does not display correctly in MediaWiki. If, for instance, "π" is rendered as "p" in MediaWiki's output, your software is generating characters with the Symbol font and needs to be reconfigured. MediaWiki's fonts have good Unicode coverage, so it is generally possible to replace these with proper Unicode characters. As a last resort, you may convert the text into paths or use PNG.
Plot requires color
Consider making multiple plots to display the information carried by color, or removing information from the plot. If space considerations mean that color coding is the only way to concisely differentiate parts of the graph, ensure the description provides sufficient information that colorblind users can still guess the colors and understand the picture.
Script is not reproducible
It is fine if your script requires specialized libraries, or data that is encumbered or too large to include - simply note this in the script and provide as much information as is feasible.
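For the "No SVG output" exception, it can help to work out in advance how large text and lines will appear after the final downscale. The sketch below only does that arithmetic. The function name `downscale_plan` and the dictionary keys are hypothetical, and it assumes fonts and line widths shrink by the same factor as the image.

```python
def downscale_plan(width, height, target_width, font_size, line_width):
    """Report the final size and feature sizes after downscaling a bitmap."""
    factor = target_width / width
    return {
        "size": (target_width, round(height * factor)),
        "factor": factor,
        "font_size": font_size * factor,    # apparent font size afterwards
        "line_width": line_width * factor,  # apparent line thickness afterwards
    }


# Values taken from the example in the text: 6000x4500 reduced to 1300 wide.
plan = downscale_plan(6000, 4500, 1300, font_size=48, line_width=17)
print(plan["size"])                  # final pixel dimensions
print(round(plan["font_size"], 1))   # apparent font size
print(round(plan["line_width"], 1))  # apparent line thickness
```

With the figures from the text, a 48-unit font ends up a little over 10 units tall and a 17-pixel line a little under 4 pixels wide, which is why the original has to be drawn so heavily.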

Design

Edward Tufte, in his book The Visual Display of Quantitative Information, distinguishes between friendly and unfriendly data graphics as follows:

Friendly
  • words are spelled out, mysterious and elaborate encoding avoided
  • words run from left to right, the usual direction for reading occidental languages
  • little messages help to explain data
  • elaborately encoded shadows, cross-hatching, and colors are avoided; instead, labels are placed on the graphic itself; no legend is required
  • graphic attracts viewer, provokes curiosity
  • colors, if used, are chosen so that the color-deficient and color-blind (5 to 10 percent of viewers) can make sense of the graphic (blue can be distinguished from other colors by most color-deficient people)
  • type is clear, precise, modest
  • type is upper-and-lower case, with serifs

Unfriendly
  • abbreviations abound, requiring the viewer to sort through text to decode
  • words run vertically, particularly along the Y-axis; words run in several different directions
  • graphic is cryptic, requires repeated references to scattered text
  • obscure codings require going back and forth between legend and graphic
  • graphic is repellent, filled with chartjunk
  • design insensitive to color-deficient viewers; red and green used for essential contrasts
  • type is clotted, overbearing
  • type is all capitals, sans serif

Plotting

gnuplot

Many of the graphs on Wikipedia were made with the free software program gnuplot. It can be used by itself or in conjunction with other software.

For example, to plot the data in file "data":

set xlabel "steps"
set ylabel "result"
unset key
#use bars in plot: with boxes
#choose line color/style in plot: linetype n
#plot filled bars (fs): pattern n
set style fill pattern 2
plot "data" with boxes linetype 3 fs

There is additional discussion of plotting with gnuplot on Template talk:Probability distribution § Standard Plots.

A plot of Hermite polynomials, generated by gnuplot in SVG format
A plot of the floor function, generated by gnuplot in SVG format

Now that MediaWiki supports SVG, it's usually best to generate SVG images directly. SVG images have many advantages, like being fully resizable, easier to modify, and so on, though they are sometimes inferior to raster images. Decide on a case-by-case basis.

A typical plot file could start with:

 set terminal svg enhanced size 1000 1000 fname "Times" fsize 36
 set output "filename.svg"
size
Sets the size of the plot. This controls the size of features in the PNG rendered by Wikipedia.
fname
Sets the font
fsize
Sets the font size. Also sets the size of plotted points
set output
Sets the filename for saving the SVG information
A plot of the normal distribution, generated by gnuplot

Gnuplot can also generate raster images (PNG):

For the best results, a PostScript file should be generated and converted into PNG in an external program, like the GIMP. PostScript is generated with the line set terminal postscript enhanced:

 set terminal postscript enhanced color solid lw 2 "Times-Roman" 20
 set output "filename.ps"
color
Make a color plot instead of black-and-white
solid
Make all lines solid instead of dashed. You may want to remove this to make dashed lines which are distinguishable on both color and black and white versions of the same plot.
lw 2
Sets the linewidth of all the lines at once.
"Times-Roman" 20
Sets the font and font size
set output
Sets the filename for saving the Postscript information

You should use a large number of samples for high-quality plots:

 set samples 1001

This is important to prevent aliasing or jagged linear interpolation (see File:Exponentialchirp.png and its history for an example of aliasing). Labels are helpful, but remember to keep language-specific information in the caption if it's not too inconvenient. Including the source code and/or an image without text helps other users create versions in their own language, if text is included in the image.

 set xlabel "Time (s)"
 set ylabel "Amplitude"
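On the sampling advice above: a plotted curve is drawn as straight segments between samples, so too few samples show up as visible corners. The following sketch measures only that linear-interpolation error, not aliasing in the signal-processing sense. It uses Python's standard library, and the function name `max_interpolation_error` is hypothetical.

```python
import math


def max_interpolation_error(func, start, stop, samples):
    """Largest gap between func and its straight-line plot, checked at
    the midpoint of every segment between neighbouring samples."""
    step = (stop - start) / (samples - 1)
    worst = 0.0
    for i in range(samples - 1):
        left = start + i * step
        right = left + step
        midpoint = 0.5 * (left + right)
        straight_line = 0.5 * (func(left) + func(right))
        worst = max(worst, abs(func(midpoint) - straight_line))
    return worst


errors = {
    n: max_interpolation_error(math.sin, 0.0, 2 * math.pi, n)
    for n in (11, 101, 1001)
}
for n, err in errors.items():
    print(f"{n:5d} samples: largest midpoint error {err:.1e}")
```

For a smooth curve the error shrinks roughly with the square of the sample spacing, so ten times as many samples reduces it by about a factor of one hundred.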

The legend or key is positioned according to the coordinate system you used for the graph itself:

 set key 4,0

Most other options are not Wikipedia-graph-specific, and should be gleaned from documentation or the source code included with other plots. An example of a plot generated with gnuplot is shown on the right, with source code on the image description page.[1]

Maxima

A plot of the Hilbert transform of a square wave, generated by gnuplot from Maxima

Maxima is a computer algebra system licensed under the GPL, similar to Mathematica or Maple. It uses gnuplot as its default plotter, though others are available, such as openmath. Plotting directly to PostScript from Maxima is supported, but gnuplot's PostScript output is more powerful.

The most-used commands are plot2d and plot3d:

 plot2d (sin(x), [x, 0, 2*%pi], [nticks, 500]);
 plot3d (x^2-y^2, [x, -2, 2], [y, -2, 2], [grid, 12, 12]);

Since the plot is sent to gnuplot as a series of samples, not as a function, the Maxima nticks option is used to set the number of sampling points instead of gnuplot's set samples. Additional plot options are included in brackets inside the plot command. To use the same options as in the above gnuplot example, add these lines to the end of the plot command:

PostScript output:

[gnuplot_term, ps]
[gnuplot_ps_term_command, "set term postscript enhanced color solid lw 2 'Times-Roman' 20"]

SVG output:

[gnuplot_term, ps]
[gnuplot_ps_term_command, "set terminal svg enhanced size 1000 1000 fname 'Times' fsize 36"]

Output filename:

[gnuplot_out_file, "filename.ps"]

Additional gnuplot commands:

[gnuplot_preamble, "set xlabel 'Time (s)'; set ylabel 'Amplitude'; set key 4,0"]

Like so:

  plot2d (sin(x), [x, 0, 2*%pi], [nticks, 500], [gnuplot_term, ps],
          [gnuplot_ps_term_command, "set term postscript enhanced color solid lw 2 'Times-Roman' 20"],
          [gnuplot_out_file, "filename.ps"],
          [gnuplot_preamble, "set xlabel 'Time (s)'; set ylabel 'Amplitude'; set key 4,0"]);

Similarly, for SVG output:

  plot2d (sin(x), [x, 0, 2*%pi], [nticks, 500], [gnuplot_term, ps],
          [gnuplot_ps_term_command, "set terminal svg enhanced size 1000 1000 fname 'Times' fsize 36"],
          [gnuplot_out_file, "filename.svg"]);

Note that the font and labels are in single quotes now, nested inside double quotes. Multiple commands are separated by semicolons.

An example of a plot generated with gnuplot in Maxima is shown on the right, with source code on the image description page.[2]

GNU Octave

GNU Octave is a numerical computation program; effectively a MATLAB clone. It uses gnuplot extensively (though also offers interfaces to Grace and other graphing software).

The commands are plot (2D) and splot (surface plot).

A graph of the envelope of a wave in GNU Octave and gnuplot
 t = [0 : .01 : 1];  
 y = sin (2*pi*t);

 plot (t, y, "linewidth", 2)
 xlabel("Time (s)");
 ylabel("Amplitude");
 print("filename.svg", "-color", "-solid", "-tight", "-FTimes-Roman:20")

Matplotlib

Matplotlib is a plotting package for the free programming language Python. Its pyplot interface is procedural and modeled after MATLAB, while the full Matplotlib interface is object-oriented.

Python and Matplotlib are cross-platform and are available for Windows, macOS, and Unix-like operating systems such as Linux and FreeBSD.

Matplotlib can create plots in a variety of output formats, such as PNG and SVG. Matplotlib mainly does 2-D plots (such as line, contour, bar, scatter, etc.), but 3-D functionality is also available.

A simple SVG line plot with Matplotlib

Here is a minimal line plot (output image is shown on the right):

import matplotlib.pyplot as plt
import numpy as np
a = np.linspace(0, 8, 501)
b = np.exp(-a)
plt.plot(a, b)
plt.savefig("Matplotlib3 lineplot.svg")
plt.show() # show plot in GUI (optional)

Save this script as e.g. lineplot.py and then run it with python lineplot.py. After a few seconds, a window with the interactive graphical output should pop up, and the SVG will also be in the folder.

Numerous examples with Python source code are available, for example the Matplotlib gallery and commons:Category:Valid SVG created with Matplotlib code.
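The minimal script can be extended to follow more of the guidelines on this page. The version below is a sketch rather than a recommended template: the curve parameters, label positions, and output file name are arbitrary choices made for illustration. It sets the `svg.fonttype` option to `"none"` so that labels stay as text instead of being converted to paths, distinguishes the curves by line style rather than color, and labels each curve directly instead of using a legend.

```python
import matplotlib

matplotlib.use("Agg")  # draw to files only; no GUI window is needed

import matplotlib.pyplot as plt
import numpy as np

# Keep labels as <text> elements in the SVG instead of glyph outlines.
plt.rcParams["svg.fonttype"] = "none"

x = np.linspace(0, 8, 501)
fig, ax = plt.subplots()

# (decay rate, line style, label made of symbols and numbers only)
curves = [(1.0, "-", "λ = 1"), (0.5, "--", "λ = 0.5")]
for rate, style, label in curves:
    y = np.exp(-rate * x)
    ax.plot(x, y, linestyle=style, color="black")
    # Place the label next to the curve itself rather than in a legend.
    ax.annotate(label, xy=(x[100], y[100]),
                xytext=(5, 5), textcoords="offset points")

fig.savefig("lineplot_labeled.svg")
```

Axis descriptions are left out on purpose here so that they can go in the caption; add `ax.set_xlabel` and `ax.set_ylabel` calls if the plot needs them.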

Wikimedia SVG Chart

An SVG plot with Wikimedia SVG Chart.

Wikimedia SVG Chart is a graph generator that uses the template functionality of Wikimedia Commons. The template generates line and point charts in a structured, readable SVG format. The original values are stored unmodified in the SVG file, so the chart's data can be checked or extended at any time directly in the file with any text editor.

Instructions for a simple line plot:

{{SVG Chart
| Title         = Demo
| XAxisText     = foo
| YAxisText     = bar
| LegendType    = none
| XMax          = 60
| YMax          = 40
| XAxisMarkStep = 10
| YAxisMarkStep = 10
| Graph1Values =
      0 30
     20 10
     40 40
     60 30
| Graph2Values =
      0  0
     20 30
     40 10
     60 30   
}}
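When the data already exists as a list of numbers, the value rows can be generated instead of typed. This sketch assumes, based only on the example above, that the template accepts one whitespace-separated "x y" pair per line. The helper name `graph_values_param` is hypothetical and not part of the template.

```python
def graph_values_param(index, points):
    """Format (x, y) pairs as a GraphNValues parameter, right-aligned
    in columns like the hand-written example."""
    width_x = max(len(str(x)) for x, _ in points)
    width_y = max(len(str(y)) for _, y in points)
    rows = [f"     {x:>{width_x}} {y:>{width_y}}" for x, y in points]
    return f"| Graph{index}Values =\n" + "\n".join(rows)


param = graph_values_param(1, [(0, 30), (20, 10), (40, 40), (60, 30)])
print(param)
```

The printed text can be pasted between the other template parameters.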

Xfig

Xfig is an open source vector graphics editor that runs under X on most Unix platforms. In Xfig, figures may be drawn using objects such as circles, boxes, lines, spline curves, and text. It can import images in many formats, such as GIF, JPEG, SVG, and EPSF. An advantage of Xfig is its ability to display mathematical formulas in labels and legends using the TeX language.

R

An example of a non-antialiased PNG scatterplot created by R

The free statistical package R (see R programming language) can make a wide variety of graphics. It is especially effective for displaying statistical data. On Wikimedia Commons, the category Created with R contains many examples, often including the corresponding R source code. Other examples can be found in the R Graph Gallery.

To output PostScript, use the postscript command:

postscript(file = "myplot.ps")
plot(...)
graphics.off()

The last command will close the postscript file; it won't be ready until it's closed.

With an additional (free) package, it's also possible to generate SVG-graphs with R directly. See an example with code on Image:Circle area Monte Carlo integration2.svg.

Other packages (lattice, ggplot2) provide alternative graphics facilities or syntax.

Here is another example with data.

Gri

The Gri graphical language can be used to generate plots and figures using script-like commands. Unlike other tools, Gri is not point-and-click and requires learning the Gri script syntax.

Maple

Maple is a popular proprietary computer algebra system. Maple can export graphs in Encapsulated PostScript format, which can then be converted to SVG, for example in Inkscape. To do this using the standard GUI, follow these steps:

  1. Display the graph and adjust it until it looks the way you want.
  2. Right-click on the graph and select "Export" → "Encapsulated Postscript" from the menu which appears. Choose a file name to save the graph as.
  3. In Inkscape, import the graph using "File" → "Import...". After importing, select "File" → "Document Properties..." and click "Fit page to selection". Save the SVG file and upload it.

Dynamic geometry

GeoGebra

GeoGebra can be used to plot curves and points, as well as experiment with and draw geometric shapes. It also exports to SVG.
GeoGebra's image export dialog. Note that this dialog may be missing as of GeoGebra version 6.

GeoGebra is a dynamic geometry program that can be used to create geometric objects free-hand using compass-and-ruler tools. It can also be used to plot implicit curves, parametric curves, and loci of points. It supports SVG, PNG, EPS, PDF, EMF, PGF/TikZ and PSTricks as export formats and has support for LaTeX formulas within text objects.

GeoGebra is not a drawing tool,[3][4] and therefore suffers from some caveats that people accustomed to programs such as Inkscape or Illustrator might not be expecting.[5] However, if your requirements with regard to pixel-perfect results are not too stringent, then you can quickly and easily create graphs and diagrams in GeoGebra.

If you want to set the dimensions (in pixels) of the graphical output as close to exact as possible, start by referring to the instructions in this discussion, and note the observations in this discussion: the resulting image may still be off by a few pixels. Alternatively, you can export to SVG and fix the file using a text editor. Having done this, you can then use Inkscape to convert the SVG file to PNG or JPEG.

C.a.R.

C.a.R.'s export dialog is somewhat more versatile than GeoGebra's, though pixel-perfect results are still tricky to achieve.

C.a.R. (standing for "compass and ruler") is very similar to GeoGebra in that both programs are free, point-and-click, dynamic geometry applications running under Java and supporting PNG, SVG and other output formats. It is not nearly as feature-rich as GeoGebra, but at the same time overcomes some of GeoGebra's limitations with respect to vector and raster image export.[6]

Surfaces & solids

POV-Ray

A rendering of several three-dimensional solids done using the ray-tracer, POV-Ray (left). Using an updated "screen.inc" it is possible to get precise 2D screen coordinates of 3D points and import them into programs like Inkscape to create arrows, labels and other 2D elements (right).

POV-Ray is a free general-purpose constructive solid geometry ray-tracing package with a scene description language very similar to many programming languages. It can also render parametric surfaces and algebraic surfaces of degree up to seven, as well as triangle mesh approximations using the "mesh" and "mesh2" object types and "param.inc". An updated version of the file "screen.inc" can be used to output the exact two-dimensional screen coordinates of any three-dimensional object so as to facilitate the addition of labels or other 2D elements in post-processing, for instance in Inkscape.

Other surface tools

Other usable tools include:

  • surf, which is specialized for algebraic curves and surfaces;
  • surfex, which is built on top of surf.

These tools are only capable of producing raster output.

  • Blender is a free triangle-based 3D modeler. It's possible to create mathematical surfaces in tools such as K3DSurf and import them into Blender. It's also possible to export Blender renders to SVG using a third-party plugin.

Figures, diagrams & charts

A diagram created with Graphviz (left). Illustration of Desargues' theorem made using Inkscape (right).

Graphviz

For graph-theory diagrams and other "circles-and-arrows" pictures, Graphviz is quick and easy, and also able to make SVGs.
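Graphviz reads a plain-text description in its DOT language, which is easy to generate from data. The sketch below builds such a description with Python's standard library. The function name `to_dot`, the graph name `G`, and the file names are assumptions, and node names containing double quotes are not escaped.

```python
def to_dot(edges, directed=True):
    """Build a DOT description from (source, target) pairs."""
    keyword = "digraph" if directed else "graph"
    arrow = "->" if directed else "--"
    lines = [f"{keyword} G {{"]
    for source, target in edges:
        lines.append(f'    "{source}" {arrow} "{target}";')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot([("A", "B"), ("B", "C"), ("C", "A")])
with open("example.dot", "w", encoding="utf-8") as handle:
    handle.write(dot_text + "\n")
```

With Graphviz installed, running `dot -Tsvg example.dot -o example.svg` should then produce the SVG.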

Inkscape

Besides being useful for post-processing (see the next section), Inkscape is a point-and-click tool that can be used to create high-quality figures. It is a particularly easy tool for creating vector graphics, though GeoGebra and C.a.R. may be better suited for mathematical graphics. Inkscape's design concepts also differ from SVG in some fundamental ways. For instance, in SVG element widths are applied before stroke widths, whereas in Inkscape they are applied after.

LibreOffice and Apache OpenOffice

Bar chart created with OpenOffice.org Calc based on data stored in a spreadsheet.
Screenshot of OpenOffice.org Draw. Features may not be as numerous or advanced as in Inkscape, though integrating data from other OpenOffice.org applications is a plus.

LibreOffice and Apache OpenOffice are two free office suites, both forked from the now-discontinued OpenOffice.org suite. Among other things, they can create line, bar and pie charts from data in spreadsheets and databases, and they include a vector drawing program called Draw. There is a plugin for importing SVG images into OpenOffice.org, and current versions of LibreOffice Draw include SVG import and export by default. As of 2010, support for the full range of options offered by Inkscape and many other vector formats was still preliminary at best.[7][8]

Gnumeric

Population chart in SVG format produced by export from Gnumeric

Gnumeric is a fairly lightweight spreadsheet and charting application, part of the GNOME Free Software Desktop Project. It is available for Linux and other Unix-like systems, as well as NT-based versions of Windows. Charts are generated by the usual method of selecting a data range and clicking a toolbar icon. The initial chart is minimal, but double-clicking it opens a tabbed dialogue that gives a high level of control over all elements of the chart, which are arranged in a hierarchical, nested structure. New elements can be added by clicking on the appropriate level and the Add button. The most important feature for Wikipedia charts is that Gnumeric charts can be exported as graphics simply by right-clicking anywhere on the finished chart and selecting Save as Image. A range of formats is supported, including SVG and PNG. The accompanying population chart is a typical result.

Post-processing

Modifying SVG images

SVG images can be post-processed in Inkscape. Line styles and colors can be changed with the Fill and Stroke tool. Objects can be moved in front of or behind other objects with the Object → Raise and Object → Lower menu commands.

Saving from Inkscape also adds information that isn't present in Gnuplot's default output – neither Firefox nor Mozilla will render the file natively without it. These browsers can be persuaded to render Gnuplot's SVG output if the <svg> tag has the following attribute: xmlns="http://www.w3.org/2000/svg", as described at the Mozilla FAQ.
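The namespace attribute can also be added without opening Inkscape. A minimal sketch using Python's standard library follows. The function name `ensure_svg_namespace` is hypothetical, and the code assumes the first `<svg` start tag contains no ">" character inside an attribute value.

```python
import re

SVG_NS = "http://www.w3.org/2000/svg"


def ensure_svg_namespace(svg_text):
    """Add xmlns="http://www.w3.org/2000/svg" to the first <svg> start
    tag unless a default namespace is already declared there."""
    match = re.search(r"<svg\b[^>]*>", svg_text)
    if match is None:
        raise ValueError("no <svg> start tag found")
    start_tag = match.group(0)
    # "xmlns=" is the default namespace; "xmlns:xlink=" does not count.
    if re.search(r"\bxmlns\s*=", start_tag):
        return svg_text
    fixed_tag = start_tag.replace("<svg", f'<svg xmlns="{SVG_NS}"', 1)
    return svg_text[:match.start()] + fixed_tag + svg_text[match.end():]


example = '<svg width="600" height="480"><path d="M0 0L10 10"/></svg>'
fixed = ensure_svg_namespace(example)
print(fixed)
```

Running the function on a file that already declares the namespace returns the text unchanged, so it is safe to apply more than once.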

Converting PostScript to SVG

pstoedit -f plot-svg Picture.ps Picture.svg

Direct SVG output is probably better if the program supports it. See Wikipedia:WikiProject Electronics/How to draw SVG circuits using Xcircuit for an example.

Editing PostScript colors and linestyles manually

Setting colors and linestyles in gnuplot is not easy. They can more easily be changed after the PostScript file is generated by editing the PostScript file itself in a regular text editor.

This avoids the need for proprietary software and is not difficult, which helps if you are unfamiliar with other PostScript editing software.

Find the section of the .ps file with several lines starting with /LT. Identify the lines easily by their color ("the arrow is currently magenta and I want it to be black. Ah, there is the entry with 1 0 1, red + blue = magenta") or by using the gnuplot linestyle−1 (for instance, gnuplot's linestyle 3 corresponds to the ps file's /LT2). Then you can edit the colors and dashes by hand.

 /LT0 { PL [] 1 0 0 DL } def

/LT0 corresponds to gnuplot's linestyle 1. The [] represents a solid line. 1 0 0 is the color of the line; an RGB triplet with values from 0 to 1. This line is red.

 /LT2 { PL [2 dl 3 dl] 0 0 1 DL } def

/LT2 corresponds to gnuplot's linestyle 3. The [2 dl 3 dl] represents a dashed line. There are 2 units of line followed by 3 units of empty space, and so on. 0 0 1 represents the color blue.

 /LT5 { PL [5 dl 2 dl 1 dl 2 dl] 0.5 0.5 0.5 DL } def

/LT5 corresponds to gnuplot's linestyle 6. The [5 dl 2 dl 1 dl 2 dl] represents a dash-dot line. There are 5 units of line (the dash) followed by 2 units of empty space, 1 unit of line (the dot), 2 more units of empty space, and then it starts over again. 0.5 0.5 0.5 represents the color gray.

/LTb is the graph's border, and /LTa is for the zero axes.[9]
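The same edit can be scripted when many files need the same change. The sketch below assumes the definitions look exactly like the three examples above (`/LTn { PL [dashes] r g b DL } def`). Other gnuplot versions may write them differently, and the function name `set_linetype` is hypothetical.

```python
import re


def set_linetype(ps_text, linestyle, rgb, dashes=()):
    """Rewrite the /LT definition for a gnuplot linestyle.

    linestyle -- gnuplot linestyle number (linestyle 1 is /LT0)
    rgb       -- three values from 0 to 1
    dashes    -- alternating line/gap lengths; empty for a solid line
    """
    index = linestyle - 1
    dash_spec = " ".join(f"{length:g} dl" for length in dashes)
    colour = " ".join(f"{value:g}" for value in rgb)
    new_def = f"/LT{index} {{ PL [{dash_spec}] {colour} DL }} def"
    pattern = re.compile(rf"/LT{index}\s*\{{[^}}]*\}}\s*def")
    # A function replacement avoids backslash handling in the new text.
    new_text, count = pattern.subn(lambda _m: new_def, ps_text, count=1)
    if count == 0:
        raise ValueError(f"/LT{index} definition not found")
    return new_text


sample = (
    "/LT0 { PL [] 1 0 0 DL } def\n"
    "/LT2 { PL [2 dl 3 dl] 0 0 1 DL } def\n"
)
# Make linestyle 3 a black dashed line: 5 units of line, 2 of gap.
edited = set_linetype(sample, 3, (0, 0, 0), dashes=(5, 2))
print(edited)
```

Keep a copy of the original file, since a pattern that does not match your gnuplot version's output raises an error rather than guessing.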

Converting PostScript to PNG and editing with the GIMP

To post-process PostScript files for raster output (though vector is preferred):[10]

  1. Open the file in the GIMP (make sure you have ghostscript installed! — Windows Ghostscript installation instructions)
    • Enter 500 in the "resolution" input box
    • You may need to uncheck "try bounding box", since the bounding box sometimes cuts off part of the image.
      • Enter large values for Height and Width if not using the bounding box
    • Select color
    • Select strong anti-aliasing for both graphics and text
  2. Crop off extra whitespace (shift+C if you can't find it in the toolbox)
  3. Image → Transform → Rotate 90 degrees clockwise
  4. Filters → Blur → Gaussian blur (No need to blur if you use strong anti-aliasing during conversion. No significant difference between end results.)
    • 2.0 px
  5. Image → Scale Image...
    • 25%
    • Cubic interpolation
  6. You can view at normal size if you want by pressing 1, Ctrl+E
  7. Save as File_name.png

Converting PostScript to PNG with ImageMagick

Another route for converting a PostScript (PS or EPS) file to PNG is ImageMagick, which is available on many operating systems. A single command is needed:

convert -density 300 file.ps file.png

The density parameter is the output resolution, expressed in dots per inch. With the standard 5x3.5in size of a gnuplot graph, this results in a 1500x1050 pixels PNG image. ImageMagick automatically applies antialiasing, so no post-processing is needed, making this technique especially suited to batch processing. The following Makefile automatically compiles all gnuplot files in a directory to EPS figures, converts them to PNG and then clears the intermediate EPS files. It assumes that all gnuplot files have a ".plt" extension and that they produce an EPS file with the same name, and the ".eps" extension:

GNUPLOT_FILES = $(wildcard *.plt)
# create the target file list by substituting the extensions of the plt files
FICHIERS_PNG = $(patsubst %.plt,%.png,  $(GNUPLOT_FILES))

all: $(FICHIERS_PNG)

%.eps: %.plt
	@echo "compilation of $<"
	@gnuplot $<

%.png: %.eps
	@echo "conversion in png format"
	@convert -density 300 $< $*.png 
	@echo "end"
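The relation between the density setting and the resulting pixel size is simple multiplication, which is useful when aiming for a particular width. The sketch below only does that arithmetic. The function names are hypothetical, and the 5×3.5 inch page size is the figure quoted above.

```python
def raster_size(width_in, height_in, density):
    """Pixel dimensions of a page rendered at the given dots per inch."""
    return round(width_in * density), round(height_in * density)


def density_for_width(width_in, target_px):
    """Density needed for the rendered image to be target_px wide."""
    return target_px / width_in


print(raster_size(5, 3.5, 300))    # (1500, 1050), as stated in the text
print(density_for_width(5, 1000))  # 200.0
```

The second value would be passed to the `-density` option in place of 300.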

See also

References
