Softpanorama

Home Switchboard Unix Administration Red Hat TCP/IP Networks Neoliberalism Toxic Managers
May the source be with you, but remember the KISS principle ;-)
Skepticism and critical thinking is not panacea, but can help to understand the world better

Excel 2003 Macro Programming

News Office Recommended Books Recommended Links Recommended Articles Reviews
Selected Excel Functions VBA User-defined functions (UDF) Excel Object model VBA Books R programming language
Retirement modeling Excel formulas Simple 401K portfolio modeling Tips Humor Etc

Excel is not only commodity spreadsheet it is also a powerful development platform. You can develop VBA custom applications where the user can't see the underlying application (Excel). The best version was probably Excel2003. Many users prefer it to latest versions.

Excel can help to solve simple problems quicker and provide better visual feedback loop, then with a standalone procedural language like R or an expensive application such as Matlab or Mathematica. Since Excel has such advanced computational capabilities and is ubiquitous in offices and homes, it would make sense to learn the capabilities of this application.

While this product is marketed by Microsoft as user-oriented, but in reality it has "professional" capabilities. and is suitable for solving wide range of numerical problem and can perform pretty complex financial simulations including simple statistical analysis tasks such as correlating data.

You can also perform curve fitting and regression, solve equations, perform integration and differentiation, and solve both ordinary and partial differential equations. Some linear programming optimization tasks also can be done with Excel.

There are very few high quality books written about Excel. Most books including a dozen of books from O'Reilly are weak. Moreover there is also tremendous amount of low quality, junk books and some of them are rated pretty highly on Amazon (lemmings effect ;-)

Spreadsheet applications were the killer application that drove the personal computer market. In the early PC days, Lotus 1-2-3 (long with WordPerfect) dominated the productivity software business.

It's probably a safe bet that there's more data stored in Excel spreadsheets than in all the world's databases.

Excel's Data menu offers basic database features, such as sorting, filtering, validation, and data-entry forms. You can quickly import and export data in a variety of formats, including those of database management applications, such as Access. The need for macro-programming in VBA become apparent as soon as you  need for entering, manipulating, and reporting data grow beyond the spreadsheet's basic row-column metaphor.

In cases Excel power is not enough moving to a relational database management system (RDBMS), such as Access might be appropriate. The latter makes creating pretty complex databases with multiple linked tables quick and easy. It also contains Query by Example wizard (called called Query Design view). Access 2003 and 2007 has a collection of wizards to lead you step-by-step through each process involved in developing and using a production-grade database application.

The first step is Excel macro programming is creation of user-defined functions (UDF). Excel makes the creations of UDF really easy and such functions can be a valuable addition to set of standard functions. See

Here is the code to create our user-defined Commission() function:

Function Commission(Fee)
If Fee <= 1000 Then Commission = Fee * 0.1 Else Commission = Fee * 0.05
End Function

You can store UDF in several places which affect you ability to share them:

There are a lot of useful but little known Tips for exacel usage. See Excel Tips. Here is one:

Ctrl+` in Excel switches between Normal view and Formula view.

See also Automation and programmability - Excel - Microsoft Office It contain several very useful excepts form books and article written by Microsoft staffers. Among them

Reviews

[Oct 20, 2010] Microsoft Excel Data Analysis and Business Modeling

John Matlock: Non-Traditional Uses of Excell, May 24, 2005
This book is aimed at a level slightly higher than the total beginner. On page one, for instance he assumes that you understand the statement SUM(A5000:A5049) means. Then he points out that writing SUM(USSales) would make it easier to understand. He then describes how to name a range of cells -- Not exactly what I'd call advanced, but certainly above novice.

The first third, or so, of the book is on these kinds of slightly advanced procedures that you could get out of several Excell books. After that he kind of turns the approach around to things like "Estimating a Demand Curve." This is looking for what you want to do rather than just looking at a technique within Excell.

Many of his chapters have fairly sophisticated subjects, "Incorporating Qualitative Factors into Multiple Regression." In this chapter he then looks at multiple factors that might go into predicting things like predicting sales, predicting the Presidential race - with fairly surprising results.

In a couple of chapters he has a bit of fun, i.e. Chapter 73 is "Picking Your Fantasy Football Team."

Excell is quite a powerful package. It can be used for a lot of things beyond just filling out an expense account. In this book a lot of non traditional uses are described in a very light but informative manner. It is a very well done book.

[Mar 3, 2010] Advanced Excel for Scientific Data Analysis

Slashdot

Learn to turn Excel into something that rivals Mathematica using VBA, brains, and a heaping helping of fortitude.(Jack Herrington's review.)

[Nov 4, 2006] Excel 2003 Formulas Books John Walkenbach

Very weak and expensive cookbook, November 4, 2006
Reviewer: Nikolai N Bezroukov
This is an eclectic collection of various recipies for Excel. This is a strange "lemmings" effect that the book has such a high rating on Amazon.

The book is essentially a badly written cookbook as it does not provide underling mechanics and key ideas behind the Excel formulas. Chapters are more or less disconnected and most of them can be read in any order.

At the end the reader is left with very new good findings that probably are not worth the price of the book.

Pagecount is very deceptive -- considerable part of the book is fluff -- brainless reproduction of basic things that one can find in help and that is not worth even one dollar. Many examples are very trivial and not worth reading.

Tricks like Appendix B are simply disgusting -- the author just copied the listing of functions that has no practical value whatsoever to inflate the pagecount.

Explanations mostly are extremely fuzzy. The author has real talent to make simple things complex and complex things impossible. Also this is just "do like I said" type of cookbook: the author never tries to explain concepts that are used (use of absolute adressing vs relative, the syntax intricacies of the second argument of countif and similar functions, etc)

Also the book suffers from frequent references to previous versions of Excel, which only distract the reader. One can assume that if the reader really wants to use one of the previous versions of Excel he can buy prev. edition of the book and save money.

In few places were things became more interesting they are also incomplete and/or incorrect (creation of your own VBA functions and collection of functions, usage of array functions like frequency, etc).

My impression is the author is just a book writer and does not have rich real world experience with Excel, the experience that is necessary to distinguish between important and trivial things.

As a result he cannot provide the reader any help in getting the grasp of underling architectural ideas, that are often very non-trivial (Excel is extremely powerful analytical tool disguised as a commodity spreadsheet).


Top Visited
Switchboard
Latest
Past week
Past month

NEWS CONTENTS

Old News ;-)

[Mar 12, 2021] Microsoft Open Sources Low-Code Power Fx Language - Developer.com

Mar 12, 2021 | www.developer.com

Microsoft is looking to rally a community around a Power Fx low-code programming language that is now an open source project.

Announced at the Microsoft Ignite 2021 conference, the initiative is part of an effort to extend the reach of Power Fx that is already employed within Office 365 to other offerings such as Microsoft Dataverse, Microsoft Power Automate, and Microsoft Power Virtual Agents.

Power Fx traces its lineage back to a pair of Tangram and Siena projects that ultimately gave birth to a programming language that was first widely employed by users of Microsoft Excel spreadsheets.

[Nov 22, 2020] How to Supercharge Excel With Python, by Costas Andreou

Jul 21, 2020 | towardsdatascience.com

As much as Excel is a blessing, it is also a curse. When it comes to smallish enough data and simple enough operations As much as Excel is a blessing, it is also a curse. When it comes to smallish enough data and simple enough operations Excel is king . Once you find yourself endeavoring outside of those zones however, it becomes a pain. Sure enough, you can use Excel VBA to get around such issues, but in the year 2020, you can thank your lucky stars because you don't have to! If only there was a way to integrate Excel with Python to give Excel wings! Well, now there is. A python library called If only there was a way to integrate Excel with Python to give Excel wings! Well, now there is. A python library called If only there was a way to integrate Excel with Python to give Excel wings! Well, now there is. A python library called xlwings allows you to call Python scripts through VBA and pass data between the two .


Why integrate Python with Excel VBA? The truth of the matter is, you can pretty much do anything in VBA. So, if that is the case, why would you want to use Python? Well, there are a number of reasons.
  1. You can create a custom function in Excel without learning VBA (if you don't know it already)
  2. Your users are comfortable in Excel
  3. You can speed up your data operations significantly by using Python
  4. There are libraries for just about anything in Python (Machine Learning, Data Science, etc)
  5. Because you can!!!

Getting Set Up to Use xlwings The first thing we need to do, as with any new library we want to use, is to install it. It's super easy to do; with two commands we'll be set up in no time. So, go ahead and type in your terminal: The first thing we need to do, as with any new library we want to use, is to install it. It's super easy to do; with two commands we'll be set up in no time. So, go ahead and type in your terminal:
pip install xlwings
Once the library has been downloaded and installed, we need to install the Excel integration part. Ensure you've closed down all your Excel instances and in any terminal type: Once the library has been downloaded and installed, we need to install the Excel integration part. Ensure you've closed down all your Excel instances and in any terminal type:
xlwings addin install
Assuming you experience no errors, you should be able to proceed. However, oftentimes on Win10 with Excel 2016, people will see the following error: Assuming you experience no errors, you should be able to proceed. However, oftentimes on Win10 with Excel 2016, people will see the following error:
xlwings 0.17.0
[Errno 2] No such file or directory: 'C:\\Users\\costa\\AppData\\Roaming\\Microsoft\\Excel\\XLSTART\\xlwings.xlam'
If you are one of the lucky ones to experience the above error, all you need to do is create the missing directory. You can do that easily by using the mkdir command. In my case, I did: If you are one of the lucky ones to experience the above error, all you need to do is create the missing directory. You can do that easily by using the mkdir command. In my case, I did:
mkdir C:\\Users\\costa\\AppData\\Roaming\\Microsoft\\Excel\\XLSTART
Assuming the successful installation of the excel integration with the python library, the main difference you will immediately notice is in Excel: Assuming the successful installation of the excel integration with the python library, the main difference you will immediately notice is in Excel:
Enabling User Defined Functions for xlwings First up, we need to load the Excel Add-in. You can hit Alt, L, H and then navigate to the directory above to load the plugin. Once you're done, you should be able to see the following: First up, we need to load the Excel Add-in. You can hit Alt, L, H and then navigate to the directory above to load the plugin. Once you're done, you should be able to see the following:
Finally, you need to Enable Trust access to the VBA project object model. You can do that by navigating to File > Options > Trust Center > Trust Center Settings > Macro Settings: Finally, you need to Enable Trust access to the VBA project object model. You can do that by navigating to File > Options > Trust Center > Trust Center Settings > Macro Settings:

Photo by Pakata Goh on Unsplash
Getting Started with xlwings There are two main ways you can go from Excel to Python (and back). The first one is to call a Python script directly from VBA, while the other one is through a User Defined Function. Let us have a quick look at both. In order to avoid any confusion and to have the correct set up every time, xlwings offers to create your Excel spreadsheet, ready to go. Let us then use this functionality. Using the terminal, we navigate to the directory we like and type: There are two main ways you can go from Excel to Python (and back). The first one is to call a Python script directly from VBA, while the other one is through a User Defined Function. Let us have a quick look at both. In order to avoid any confusion and to have the correct set up every time, xlwings offers to create your Excel spreadsheet, ready to go. Let us then use this functionality. Using the terminal, we navigate to the directory we like and type: In order to avoid any confusion and to have the correct set up every time, xlwings offers to create your Excel spreadsheet, ready to go. Let us then use this functionality. Using the terminal, we navigate to the directory we like and type: In order to avoid any confusion and to have the correct set up every time, xlwings offers to create your Excel spreadsheet, ready to go. Let us then use this functionality. Using the terminal, we navigate to the directory we like and type:
xlwings quickstart ProjectName
I am calling this MyFirstPythonXL. The above command will create a new folder in your pre-navigated directory with an Excel worksheet and a python file. I am calling this MyFirstPythonXL. The above command will create a new folder in your pre-navigated directory with an Excel worksheet and a python file.
Opening the .xlsm file, you immediately notice a new Excel sheet called Opening the .xlsm file, you immediately notice a new Excel sheet called _xlwings.conf . Should you wish to override the default settings of xlwings, all you have to do is rename this sheet and remove the starting underscore. And with that, we are all set up and ready to begin using xlwings.
VBA to Python Before we jump into the coding, let us first ensure we are all on the same page. To bring up our Excel VBA editor, hit Before we jump into the coding, let us first ensure we are all on the same page. To bring up our Excel VBA editor, hit Alt + F11 . This should return the following screen:
VBA editor with xlwings
The key things to note here is that this code will do the following: The key things to note here is that this code will do the following:
  1. Look for a Python Script in the same location as the spreadsheet
  2. Look for a Python Script with the same name as the spreadsheet (but with a .py extension)
  3. From the Python Script, call the function "main()"
Without any further ado, let us look at a few examples of how this can be used. Without any further ado, let us look at a few examples of how this can be used. Example 1: Operate Outside of Excel, and return the Output In this example, we will see how you carry operations outside of Excel, but then return your results in the spreadsheet. This can have an infinite amount of use cases. We will source data from a CSV file, do a modification on said data, and then pass the output to Excel. Let's review how easy it is: First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : In this example, we will see how you carry operations outside of Excel, but then return your results in the spreadsheet. This can have an infinite amount of use cases. We will source data from a CSV file, do a modification on said data, and then pass the output to Excel. Let's review how easy it is: First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : We will source data from a CSV file, do a modification on said data, and then pass the output to Excel. Let's review how easy it is: First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : We will source data from a CSV file, do a modification on said data, and then pass the output to Excel. Let's review how easy it is: First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : I have left this completely unchanged from the default. Then, the Python code : I have left this completely unchanged from the default. Then, the Python code : Then, the Python code : Then, the Python code :
import xlwings as xw
import pandas as pddef main():
    wb = xw.Book.caller()
    df = pd.read_csv(r'C:\temp\TestData.csv')
    df['total_length'] =  df['sepal_length_(cm)'] + df['petal_length_(cm)']
    wb.sheets[0].range('A1').value = df
Which results in the following: Which results in the following:
xlwings in action
Example 2: Use Excel Inputs to Drive the Operations In this example, we will read inputs from Excel, do something with it in Python, and then pass the result back to Excel. More specifically, we are going to read a Greeting, a Name and a file location of where we can find jokes. Our Python script will then take a random line from the file, and return us a joke. First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : In this example, we will read inputs from Excel, do something with it in Python, and then pass the result back to Excel. More specifically, we are going to read a Greeting, a Name and a file location of where we can find jokes. Our Python script will then take a random line from the file, and return us a joke. First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : More specifically, we are going to read a Greeting, a Name and a file location of where we can find jokes. Our Python script will then take a random line from the file, and return us a joke. First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : More specifically, we are going to read a Greeting, a Name and a file location of where we can find jokes. Our Python script will then take a random line from the file, and return us a joke. First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : First up, the VBA code : I have left this completely unchanged from the default. Then, the Python code : I have left this completely unchanged from the default. Then, the Python code : I have left this completely unchanged from the default. Then, the Python code : Then, the Python code : Then, the Python code :
import xlwings as xw
import randomdef random_line(afile):
    line = next(afile)
    for num, aline in enumerate(afile, 2):
      if random.randrange(num): continue
      line = aline
    return line
'Function from: stackoverflowdef main():
    wb = xw.Book.caller()
    listloc = str(wb.sheets[0].range('B3').value)
    fhandle = open(listloc, encoding = 'utf-8')wb.sheets[0].range('A5').value = wb.sheets[0].range('B2').value + ' ' + wb.sheets[0].range('B1').value + ' here is a joke for you'
    wb.sheets[0].range('A6').value = random_line(fhandle)
Which gives us: Which gives us:

User-Defined Functions with xlwings In pretty much the same fashion as before, we will be changing the code in the python file. In order to turn something into an Excel User Defined Function, all we need to do is include '@xw.func' before the line the function is on: The Python code: In pretty much the same fashion as before, we will be changing the code in the python file. In order to turn something into an Excel User Defined Function, all we need to do is include '@xw.func' before the line the function is on: The Python code: The Python code: The Python code:
import xlwings as xw@xw.func
def joke(x):
    wb = xw.Book.caller()
    fhandle = open(r'C:\Temp\list.csv')
    for i, line in enumerate(fhandle):
        if i == x:
            return(line)
The result: The result:

Conclusion I think you would agree that this is a nifty little library. If you are like me and you much prefer to work in Python rather than VBA but need to work in spreadsheets, then this can be an exceptional tool. Want to stay up to date with my blogs? Don't forget to I think you would agree that this is a nifty little library. If you are like me and you much prefer to work in Python rather than VBA but need to work in spreadsheets, then this can be an exceptional tool. Want to stay up to date with my blogs? Don't forget to Want to stay up to date with my blogs? Don't forget to Want to stay up to date with my blogs? Don't forget to follow me !

[Aug 08, 2020] Tools for Working with Excel and Python

Aug 08, 2020 | www.pyxll.com

Posted on August 13, 2018 by Tony Roberts
Last updated May 22, 2020

Microsoft Excel is widely used in almost every industry. Its intuitive interface and ease of use for organising data, performing calculations, and analysis of data sets has led to it being commonly used in countless different fields globally.

Whether you're a fan of Excel or not, at some point you will have to deal with it! For many applications you won't want to do complex calculations or manage large data sets in Excel itself, but you may need to take values from Excel as inputs, produce reports in an Excel format, or provide tools to Excel users. Python can be a better choice for complex tasks and fortunately there are many tools for the Python developer to work with so Excel and Python can be used together.

This post gives an overview of some of the most popular and useful tools out there to help you choose which is the right one for your specific application.

Below there's a feature matrix outlining the different features of the packages for calling Python from Excel.

Building Interactive Python Tools with Excel as a Front-End

Excel is a well known and really good user interface for many tasks. When you get into more complex tasks and processing larger datasets however you can soon reach the limits of what can sensibly be achieved in Excel. Python is a popular choice for data science and other disciplines as it can handle these complex cases far better than Excel alone. By using both together and recognising the strengths of each, it's possible for you to build really powerful interactive tools using Excel as a user-friendly front end, with all the heavy lifting done in Python.

Python is an extremely powerful language with an extensive ecosystem of 3rd party libraries. Leveraging Python in Excel spreadsheets can be a fantastic way to enhance your productivity and remove the need for importing and exporting data into and out of Excel. Interactive worksheets can be developed using Python code in the same way as you might use VBA, but with all of the advantages of Python.

There are a few tools available that can be used to bring Python to Excel and it can be difficult to know which one is right for different situations. Below is an overview of each, which I hope will highlight the differences between them and help you decide which ones are right for what you need to achieve.

See the table of features along with the packages that support them below .

PyXLL – The Python Excel Add-In

PyXLL is currently the only package that enables developers to write fully featured Excel addins in Python. It embeds the Python interpreter into Excel so that it can be used as a complete VBA replacement. You can think of it conceptually as being similar to something like Excel-DNA for C#, except that it is dynamic and imports your Python code while Excel is running – so there's no add-in to build and no need to restart Excel when modifying your Python code.

See PyXLL's Features

Using PyXLL, you can write Python code to create:

Writing a user defined function with PyXLL requires the 'xl_func' decorator to be applied to a normal Python function:

from pyxll import xl_func @xl_func def py_test(a, b, c): return (a + b) * c

PyXLL has a config file (pyxll.cfg) which contains a list of all the modules that will be imported when Excel starts. By adding the module above to the list in that file, PyXLL will expose the 'py_test' function to Excel as a user defined function to be called from a worksheet.

Some additional features of PyXLL are:

For more features take a look at the feature matrix below.

Home Page | Download PyXLL | Documentation

Download PyXLL pywin32 / comtypes

The entire Excel API (or Object Model) is exposed via COM. Everything that can be written as a VBA macro can also be written using the Excel COM API in Python by using pywin32 or comtypes.

The Excel COM API can be used from outside of Excel (e.g. from a running Python prompt, script or Jupyter notebook). If you already know how to do something in VBA then doing the equivalent task in Python via the COM API is generally quite straightforward. Calling a routine using pywin32 or comtypes from Excel (e.g. from a button on the ribbon bar, menu item or macro) can be done using PyXLL .

The Excel Object Model is documented here https://docs.microsoft.com/en-gb/office/vba/api/overview/Excel/object-model and once you understand the basic differences between VBA and Python you will find it's fairly simple to translate between the two.

To demonstrate let's go though an example. Suppose you had the following VBA code and want to translate it into Python:

Sub Macro1() Range( 'B11:K11').Select Selection.AutoFill Destination:=Range( 'B11:K16'), Type:=xlFillDefault Columns( 'B:K').Select Selection.ColumnWidth = 4 End Sub

First of all we must get the Excel Application object in Python. This code can be run from an interactive Python prompt or a Jupyter notebook, or even run inside Excel itself using PyXLL .

from win32com.client.gencache import EnsureDispatch # Get the Excel Application COM object xl = EnsureDispatch( 'Excel.Application' )

Now we have the Application object we can call the Range method in the same way as the VBA code above. The first important difference to notice is that in VBA simply calling 'Range().Select' calls the Select method, but in Python we need to use '()' to call the method.

xl. Range ( 'B11:K11' ).Select()

The next line requires a constant, 'xlFillDefault'. To access the same constant in Python we use the 'win32com.client.constants' module. Also notice that in VBA no parentheses are used when calling an object method, but in Python there are.

from win32com.client import constants xl.Selection.AutoFill(Destination = xl. Range ( 'B11:K16' ), Type = constants.xlFillDefault)

The rest of the code is similar to those lines we're just translated, and so the entire function looks like

from win32com.client.gencache import EnsureDispatch from win32com.client import constants def Macro1(): xl = EnsureDispatch( 'Excel.Application' ) xl. Range ( 'B11:K11' ).Select() xl.Selection.AutoFill(Destination = xl. Range ( 'B11:K16' ), Type = constants.xlFillDefault) xl.Columns( 'B:K' ).Select() xl.Selection.ColumnWidth = 4

The translated Python code looks very similar to the original VBA code! Automating tasks in Excel, or just calling it interactively in this way from a Jupyter notebook can be very powerful.

This Python code could be called from Excel as a macro using PyXLL's "@xl_macro" decorator. Instead of using EnsureDispatch, pyxll.xl_app() should be used to ensure that if there are multiple Excel processes running the correct one is returned.

xlwings

xlwings provides a wrapper around the Excel COM API described above for simplifying many common tasks, such as writing Pandas DataFrames to an open Excel workbook. It uses pywin32's COM wrappers and gives you access to those, so you can always drop down to using the normal Excel API should you need to.

In the same way as pywin32 and comtypes, xlwings can talk to Excel from a normal Python prompt or Jupyter notebook. For calling code using xlwings from Excel itself, PyXLL provides a convenient way of getting the Excel Application object as an xlwings object . This allows you to script Excel in Python and trigger running your code from a ribbon button or menu item. An example use-case could be a ribbon button for fetching data from a database, building a report, and writing it straight into the running Excel.

The following shows how values can be read and written to a running Excel workbook, including a Pandas DataFrame.

import xlwings as xw wb = xw.Book( 'workbook.xlsx' ) # Open an existing Workbook sheet = wb.sheets[ 'Sheet1' ] # read and write values from the worksheet sheet. range ( 'A1' ).value = 'Foo' print (sheet. range ( 'A1' ).value) # Write a Pandas DataFrames directly to the Excel sheet import pandas as pd df = pd.DataFrame([[ 1 , 2 ], [ 3 , 4 ]], columns = [ 'a' , 'b' ]) sht. range ( 'A1' ).value = df # Read the DataFrame back, using the 'expand' option to read the whole table sht. range ( 'A1' ).options(pd.DataFrame, expand = 'table' ).value

xlwings includes a way of writing user defined functions (UDFs) or worksheet functions in Python that are called from a formula in Excel, similar to the user defined functions offered by PyXLL . These rely on a server process running outside of Excel and VBA wrappers to call into that server. It's a simple solution with some drawbacks, such as poor performance and that those functions are only available from the workbook containing the VBA wrappers.

DataNitro

DataNitro is another API to control Excel from Python. It's not clear what the advantage over its API and the existing and well understood Microsoft Excel COM API is, but it does allow you to write and run scripts without leaving Excel. It has rudimentary support for user defined functions (worksheet functions), but they run outside of the Excel process and only work if there is only one Excel process running.

DataNitro is no longer under active development and is not available to license any more, but it was included here for completeness.

Feature Matrix For Integrating Python and Excel
Feature DataNitro xlwings PyXLL Comments
Basic worksheet functions DataNitro and xlwings use an external Python process, xlwings requires VBA wrapper code
Real time data Stream real time data into Excel worksheets
Ribbon customisation Give users a rich user experience with custom ribbon menus
Menu functions Call Python code from the Excel menu
Object Cache Pass Python objects between worksheet functions seamlessly via an object cache
IntelliSense IntelliSense tooltip as you type – PyXLL integrates with the ExcelDNA Intellisense Addin
Thread safe worksheet functions Improve worksheet responsiveness by using Excel's own threadpool to run worksheet functions concurrently
Asynchronous functions Don't block Excel waiting for long running functions
Macros Macros are functions that can be attached to UI elements like buttons or called from VBA
Keyboard shortcuts Keyboard shortcuts can be assigned to macros with PyXLL
Macro sheet equivalent functions Call back into Excel from a worksheet function
Function documentation Include Python function docstrings in the Excel function wizard
Automatically resize arrays Array functions can resize automatically
Volatile Functions Volatile functions are called every time a worksheet is recalculated
Full Excel API exposed xlwings uses pywin32, PyXLL users can choose between pywin32, comtypes or xlwings
Reload without restarting Excel Modules can be reloaded without restarting Excel. PyXLL also supports 'deep reloading' where all module dependencies are also reloaded.
Automatic reloading Reload automatically whenever changes are made to your code.
See PyXLL's Features Download PyXLL Reading and Writing Excel workbooks

For some tasks you may need to read or write an Excel file directly. For batch processing or tasks running on a server Excel may not be installed. The following packages allow you to read and write Excel files directly without needing to use Excel.

OpenPyXL

For working with Excel 2010 onwards, OpenPyXL is a great all round choice. Using OpenPyXL you can read and write xlsx, xlsm, xltx and xltm files. The following code shows how an Excel workbook can be written as an xlsx file with a few lines of Python.

from openpyxl import Workbook wb = Workbook() # grab the active worksheet ws = wb.active # Data can be assigned directly to cells ws[ 'A1' ] = 42 # Rows can also be appended ws.append([ 1 , 2 , 3 ]) # Save the file wb.save( 'sample.xlsx' )

Don't confuse OpenPyXL with PyXLL . The two are completely different and serve different purposes. OpenPyXL is a package for reading and writing Excel files, whereas PyXLL is a tool for building fully featured Excel Add-Ins for integrating Python code into Excel.

OpenPyXL covers more advanced features of Excel such as charts, styles, number formatting and conditional formatting. It even includes a tokeniser for parsing Excel formulas!

One really nice feature for writing reports is its built-in support for NumPy and Pandas data. To write a Pandas DataFrame all that's required is the included 'dataframe_to_rows' function:

from openpyxl.utils.dataframe import dataframe_to_rows wb = Workbook() ws = wb.active for r in dataframe_to_rows(df, index = True , header = True ): ws.append(r) wb.save( 'pandas_openpyxl.xlsx' )

If you need to read Excel files to extract data then OpenPyXL can do that too. The Excel file types are incredibly complicated and openpyxl does an amazing job of reading them into a form that's easy to access in Python. There are some things that openpyxl can't load though, such as charts and images, so if you open a file and save it with the same name then some elements may be lost.

from openpyxl import load_workbook wb = load_workbook(filename = 'book.xlsx' ) sheet_ranges = wb[ 'range names' ] print (sheet_ranges[ 'D18' ].value)

A possible downside of OpenPyXL is that it can be quite slow for handling large files. If you have to write reports with thousands of rows and your application is time-sensitive then XlsxWriter or PyExcelerate may be better choices.

XlsxWriter

If you only need to write Excel workbooks and not read them then XlsxWriter is an easy to use package to use that works well. If you are working with large files or are particularly concerned about speed then you may find XlsxWriter a better choice than OpenPyXL.

XlsxWriter is a Python module that can be used to write text, numbers, formulas and hyperlinks to multiple worksheets in an Excel 2007+ XLSX file. It supports features such as formatting and many more, including:

Writing Excel workbooks using XlsxWriter is simple enough. Cells can be written to using the Excel address notation (eg 'A1') or row and column numbers. Below is a basic example that shows creating a workbook, adding some data and saving it as an xlsx file.

import xlsxwriter workbook = xlsxwriter.Workbook( 'hello.xlsx' ) worksheet = workbook.add_worksheet() worksheet.write( 'A1' , 'Hello world' ) workbook.close()

If you are using Pandas then you'll want to use XlsxWriter's Pandas integration. It takes the hard work out of writing Pandas DataFrames to Excel, and even creating charts.

import pandas as pd # Create a Pandas dataframe from the data. df = pd.DataFrame({ 'Data' : [ 10 , 20 , 30 , 20 , 15 , 30 , 45 ]}) # Create a Pandas Excel writer using XlsxWriter as the engine. writer = pd.ExcelWriter( 'pandas_simple.xlsx' , engine = 'xlsxwriter' ) # Get the xlsxwriter objects from the dataframe writer object. workbook = writer.book worksheet = writer.sheets[ 'Sheet1' ] # Create a chart object. chart = workbook.add_chart({ 'type' : 'column' }) # Configure the series of the chart from the dataframe data. chart.add_series({ 'values' : '=Sheet1!$B$2:$B$8' }) # Insert the chart into the worksheet. worksheet.insert_chart( 'D2' , chart) # Convert the dataframe to an XlsxWriter Excel object. df.to_excel(writer, sheet_name = 'Sheet1' ) # Close the Pandas Excel writer and output the Excel file. writer.save()

When referencing the Pandas data in the worksheet (as the formula in the chart above does), you have to figure out where the data will be in the worksheet so that the formulas point to the correct cells. For reports involving a lot of formulas or charts this can become problematic as doing something as as simple as adding an extra row requires adjusting all affected formulas. For reports like that the package 'xltable' can help.

XLTable

XLTable is a higher level library for building Excel reports from pandas DataFrames. Rather than writing the workbook cell by cell or row by row, whole tables are added and can include formulas that reference other tables without having to know ahead of time where those tables will be. For more complex reports involving formulas xltable can be very useful.

The main feature that makes xltable more useful than just writing the Excel files directly is that it can handle tables with formulas that relate to cells in the workbook, without having to know in advance where those tables will be placed on a worksheet. Therefore only when all the tables have been added to the workbook and the workbook is being written are formulas resolved to their final cell addresses.

If you need to write a report that includes formulas rather than just data, XLTable makes it easier by tracking the cell references so you don't have to construct the formulas by hand and worry about references changing when tables grow or new rows or columns are added.

from xltable import * import pandas as pd # create a dataframe with three columns where the last is the sum of the first two dataframe = pd.DataFrame({ 'col_1' : [ 1 , 2 , 3 ], 'col_2' : [ 4 , 5 , 6 ], 'col_3' : Cell( 'col_1' ) + Cell( 'col_2' ), }, columns = [ 'col_1' , 'col_2' , 'col_3' ]) # create the named xltable Table instance table = Table( 'table' , dataframe) # create the Workbook and Worksheet objects and add table to the sheet sheet = Worksheet( 'Sheet1' ) sheet.add_table(table) workbook = Workbook( 'example.xlsx' ) workbook.add_sheet(sheet) # write the workbook to the file using xlsxwriter workbook.to_xlsx()

XLTable can use either XlsxWriter to write an xlsx file, or it can use pywin32 (win32com) to write directly to an open Excel application (Windows only). Writing directly to Excel is good for interactive reports. For example, you could have a button in the Excel ribbon that a user could press to query some data and produce a report. By writing it directly to Excel they can get that report immediately in Excel without having it written to a file first. For details of how to customise the Excel ribbon in Excel see PyXLL: Customizing the Ribbon .

Pandas

For working with ranges of data and reading or writing them to Excel workbooks with no frills, using pandas can be a very quick and effective method. If you don't need much in the way of formatting and just care about getting data into or out of Excel workbooks then the pandas functions "read_excel" and "to_excel" may be just what you need.

df = pd.DataFrame([ ( 'string1' , 1 ), ( 'string2' , 2 ), ( 'string3' , 3 ) ], columns = [ 'Name' , 'Value' ]) # Write dataframe to an xlsx file df.to_excel( 'tmp.xlsx' )

For more complex tasks because XlsxWriter, OpenPyXL and XLTable all have Pandas integration any of those can also be used to write Pandas DataFrames to Excel. But, for just getting data into Excel using Pandas directly as above is very convenient.

xlrd/xlwt

xlrd and xlwt read and write the old Excel .xls files respectively. These are included in this list for completeness, but are now really only used when you are forced to deal with the legacy xls file format. They are both extremely mature packages that are very capable and stable, but xlwt will never be extended to support the newer xlsx/xlsm file formats therefore for new code dealing with modern Excel file formats they are no longer the best choice.

pyexcel

If you need to deal with multiple file formats (eg xlsx, xls, ods or csv) then pyexcel can be used to handle all of them. It wraps some of the packages above (xlrd/xlwt, openpyxl and xlxswriter and others) to give a single consistent API regardless of the file format you are working with.

The pyexcel packages focuses on data rather than formatting, so if you are looking to produce high quality reports for Excel users then you should consider the alternatives, but if you need to extract data from a spreadsheet without worrying so much about the file type then this package will help you do that.

&gt; import pyexcel as pe &gt; records = pe.iget_records(file_name = 'your_file.xls' ) &gt; for record in records: ... print ( '%s is aged at %d' % (record[ 'Name' ], record[ 'Age' ])) Adam is aged at 28 Beatrice is aged at 29 Ceri is aged at 30 Dean is aged at 26 &gt; pe.free_resources()
Additional Resources

[Aug 08, 2020] excel - Using Python to program MS Office macros- - Stack Overflow

Jan 01, 2013 | stackoverflow.com

9

There is a set of cross platform Python utilities - called xlrd, xlwt, and xlutils - for reading & writing Excel files. There are some limitations (e.g. I don't think they can process macros), but they do allow you to work with Excel files on non-Windows platforms, if that's of use to you. See: http://www.python-excel.org/

Also, there are SO questions already dealing with this sort of topic, including this: Is there a better way (besides COM) to remote-control Excel?

Yes, absolutely. You want to use win32com module, which is part of pywin32 ( get it here ).

I've found you can really simplify Python integration by writing a macro in VBA for Python to use, and then just have Python call the macro. It will look something like this:

from win32com.client import Dispatch as comDispatch

xl = comDispatch('Excel.Application')
xl.Workbooks.Open("Macros.xls", False, True)
xl.Run("Macros.xls!Macro_1")

I'm sure there are plently of examples on SO... Like this one .

7

Or have a look at IronPython. IPy is a native .NET implementation of Python 2.6, you can find it at http://www.codeplex.com/ironpython .

We have used it for several projects. You can use it "from the outside" using COM or - as we do - write a Excel AddIn with a ScriptHost, which calls out to IronPython code giving you an environment similar to VBA.

Being a .NET dll, IPy integrates extremely well into the modern Windows .NET stack.


denfromufa , 2013-10-13 14:43:11

here is a useful link:

http://continuum.io/using-excel

  1. Try Pyvot – A Python to/from Excel Connector from Microsoft:

http://pytools.codeplex.com/wikipage?title=Pyvot

I also really like PTVS from the same dev. team, which provides best debugging in Python that I experienced so far.

2.

What you can do with VBA + Python is following:

Compile your py scripts that take inputs and generate outputs as text files or from console. Then VBA will prepare input for py, call the pre-compiled py script and read back its output.

3.

Consider OpenOffice or LibreOffice which support Python scripts.

This is assuming that available options with COM or MS script interfaces do not satisfy your needs.

4.

This is not free approach, but worth mentioning (featured in Forbes and New York Times):

https://datanitro.com

5.

This is not free for commercial use:

PyXLL - Excel addin that enables functions written in Python to be called in Excel.

> ,

add a comment

> ,

This is a question from a long time ago, but I'm putting it here so others can point to it in their search.

One option not discussed here, which I use all the time, is to create a COM server from Python and call it from VBA in any Office application. There is a good tutorial of doing COM server with Python at: http://timgolden.me.uk/pywin32-docs/html/com/win32com/HTML/QuickStartServerCom.html

What you end up with is a COM server (don't forget to make it In Process) that can be created by a call to CreateObject() in VBA. You then call the methods on it as you do with an COM object created with CreateObject() . You can't single step in your Python script, but you can intercept logging with win32traceutil from Pywin32 distribution. Works flawlessly.

[Aug 08, 2020] Get started using Python on Windows for scripting and automation

Aug 08, 2020 | oreilly.com

[Aug 06, 2020] python - Fill cells with colors using openpyxl

May 27, 2015 | stackoverflow.com

Ahmed Rashad , 2015-05-27 13:39:23

I am currently using openpyxl v2.2.2 for Python 2.7 and i wanted to set colors to cells. I have used the following imports

import openpyxl,
from openpyxl import Workbook
from openpyxl.styles import Color, PatternFill, Font, Border
from openpyxl.styles import colors
from openpyxl.cell import Cell

and the following is the code I tried using:

wb = openpyxl.Workbook()
ws = wb.active

redFill = PatternFill(start_color='FFFF0000',
                   end_color='FFFF0000',
                   fill_type='solid')

ws['A1'].style = redFill

but I get the following error:

Traceback (most recent call last)
  self.font = value.font.copy()
AttributeError: 'PatternFill' object has no attribute 'font'

Any idea on how to set cell A1 (or any other cells) with colors using openpyxl?

The6thSense ,

Could you provide the full trace back – The6thSense May 27 '15 at 13:45

Charlie Clark ,

I believe the issue is that you're trying to assign a fill object to a style.

ws['A1'].fill = redFill should work fine.

Callam Delaney ,

Thanks Charlie. I'm quite new to Python and the openpyxl docs were not clear about how to apply the fill to a cell. Your answer has solved the problem. – Ahmed Rashad May 28 '15 at 8:12

[Jul 28, 2020] WORKING WITH EXCEL SPREADSHEETS

Extracted from Automate the Boring Stuff with Python, 2nd Edition Please buy the book. It is really good
Jul 28, 2020 | amazon.com

... ... ...

The openpyxl.load_workbook() function takes in the filename and returns a value of the workbook data type. This Workbook object represents the Excel file, a bit like how a File object represents an opened text file.

Remember that example.xlsx needs to be in the current working directory in order for you to work with it. You can find out what the current working directory is by importing os and using os.getcwd() , and you can change the current working directory using os.chdir() .

Getting Sheets from the Workbook

You can get a list of all the sheet names in the workbook by accessing the sheetnames attribute. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> wb.sheetnames # The workbook's sheets' names.
['Sheet1', 'Sheet2', 'Sheet3']
>>> sheet = wb['Sheet3'] # Get a sheet from the workbook.
>>> sheet
<Worksheet "Sheet3">
>>> type(sheet)
<class 'openpyxl.worksheet.worksheet.Worksheet'>
>>> sheet.title # Get the sheet's title as a string.
'Sheet3'
>>> anotherSheet = wb.active # Get the active sheet.
>>> anotherSheet
<Worksheet "Sheet1">

Each sheet is represented by a Worksheet object, which you can obtain by using the square brackets with the sheet name string like a dictionary key. Finally, you can use the active attribute of a Workbook object to get the workbook's active sheet. The active sheet is the sheet that's on top when the workbook is opened in Excel. Once you have the Worksheet object, you can get its name from the title attribute.

Getting Cells from the Sheets

Once you have a Worksheet object, you can access a Cell object by its name. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb['Sheet1'] # Get a sheet from the workbook.
>>> sheet['A1'] # Get a cell from the sheet.
<Cell 'Sheet1'.A1>
>>> sheet['A1'].value # Get the value from the cell.
datetime.datetime(2015, 4, 5, 13, 34, 2)
>>> c = sheet['B1'] # Get another cell from the sheet.
>>> c.value
'Apples'
>>> # Get the row, column, and value from the cell.
>>> 'Row %s, Column %s is %s' % (c.row, c.column, c.value)
'Row 1, Column B is Apples'
>>> 'Cell %s is %s' % (c.coordinate, c.value)
'Cell B1 is Apples'
>>> sheet['C1'].value
73

The Cell object has a value attribute that contains, unsurprisingly, the value stored in that cell. Cell objects also have row , column , and coordinate attributes that provide location information for the cell.

Here, accessing the value attribute of our Cell object for cell B1 gives us the string 'Apples' . The row attribute gives us the integer 1 , the column attribute gives us 'B' , and the coordinate attribute gives us 'B1' .

[Jul 27, 2020] Statistica in Python

Jul 27, 2020 | zetcode.com

Ebooks Openpyxl tutorial

last modified July 6, 2020

In this tutorial we show how to work with Excel files in Python using openpyxl library.

https://www.facebook.com/v2.8/plugins/like.php?action=like&app_id=&channel=https%3A%2F%2Fstaticxx.facebook.com%2Fx%2Fconnect%2Fxd_arbiter%2F%3Fversion%3D46%23cb%3Df8ad26aeac378c%26domain%3Dzetcode.com%26origin%3Dhttp%253A%252F%252Fzetcode.com%252Ffaa3837d104018%26relation%3Dparent.parent&container_width=730&href=http%3A%2F%2Fzetcode.com%2Fpython%2Fopenpyxl%2F&layout=button_count&locale=en_US&sdk=joey&share=true&show_faces=true

https://platform.twitter.com/widgets/tweet_button.c4b33f07650267db9f8a72eaac551cac.en.html#dnt=false&id=twitter-widget-0&lang=en&original_referer=http%3A%2F%2Fzetcode.com%2Fpython%2Fopenpyxl%2F&size=m&text=Openpyxl%20tutorial%20-%20read%2C%20write%20Excel%20xlsx%20files%20in%20Python&time=1595883144162&type=share&url=http%3A%2F%2Fzetcode.com%2Fpython%2Fopenpyxl%2F Openpyxl

The openpyxl is a Python library to read and write Excel 2010 xlsx/xlsm/xltx/xltm files.

https://bf66057ac32c57f598df4bfd31d085c6.safeframe.googlesyndication.com/safeframe/1-0-37/html/container.html Excel xlsx

In this tutorial we work with xlsx files. The xlsx is a file extension for an open XML spreadsheet file format used by Microsoft Excel. The xlsm files support macros. The xls format is a proprietary binary format while xlsx is based on Office Open XML format.

$ sudo pip3 install openpyxl

We install openpyxl with the pip3 tool.

Openpyxl create new file

In the first example, we create a new xlsx file with openpyxl .

write_xlsx.py
#!/usr/bin/env python

from openpyxl import Workbook
import time

book = Workbook()
sheet = book.active

sheet['A1'] = 56
sheet['A2'] = 43

now = time.strftime("%x")
sheet['A3'] = now

book.save("sample.xlsx")

In the example, we create a new xlsx file. We write data into three cells.

from openpyxl import Workbook

From the openpyxl module, we import the Workbook class. A workbook is the container for all other parts of the document.

book = Workbook()

We create a new workbook. A workbook is always created with at least one worksheet.

sheet = book.active

We get the reference to the active sheet.

sheet['A1'] = 56
sheet['A2'] = 43

We write numerical data to cells A1 and A2.

now = time.strftime("%x")
sheet['A3'] = now

We write current date to the cell A3.

book.save("sample.xlsx")

We write the contents to the sample.xlsx file with the save method.

New file
Figure: New file
Openpyxl write to a cell

There are two basic ways to write to a cell: using a key of a worksheet such as A1 or D3, or using a row and column notation with the cell method.

write2cell.py
#!/usr/bin/env python

from openpyxl import Workbook

book = Workbook()
sheet = book.active

sheet['A1'] = 1
sheet.cell(row=2, column=2).value = 2

book.save('write2cell.xlsx')

In the example, we write two values to two cells.

sheet['A1'] = 1

Here, we assing a numerical value to the A1 cell.

sheet.cell(row=2, column=2).value = 2

In this line, we write to cell B2 with the row and column notation.

Openpyxl append values

With the append method, we can append a group of values at the bottom of the current sheet.

appending_values.py
#!/usr/bin/env python

from openpyxl import Workbook

book = Workbook()
sheet = book.active

rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)

for row in rows:
    sheet.append(row)

book.save('appending.xlsx')

In the example, we append three columns of data into the current sheet.

rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)

The data is stored in a tuple of tuples.

for row in rows:
    sheet.append(row)

We go through the container row by row and insert the data row with the append method.

Openpyxl read cell

In the following example, we read the previously written data from the sample.xlsx file.

read_cells.py
#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sample.xlsx')

sheet = book.active

a1 = sheet['A1']
a2 = sheet['A2']
a3 = sheet.cell(row=3, column=1)

print(a1.value)
print(a2.value) 
print(a3.value)

The example loads an existing xlsx file and reads three cells.

book = openpyxl.load_workbook('sample.xlsx')

The file is opened with the load_workbook method.

a1 = sheet['A1']
a2 = sheet['A2']
a3 = sheet.cell(row=3, column=1)

We read the contents of the A1, A2, and A3 cells. In the third line, we use the cell method to get the value of A3 cell.

$ ./read_cells.py 
56
43
10/26/16

This is the output of the example.

Openpyxl read multiple cells

We have the following data sheet:

Items
Figure: Items

We read the data using a range operator.

read_cells2.py
#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('items.xlsx')

sheet = book.active

cells = sheet['A1': 'B6']

for c1, c2 in cells:
    print("{0:8} {1:8}".format(c1.value, c2.value))

In the example, we read data from two columns using a range operation.

cells = sheet['A1': 'B6']

In this line, we read data from cells A1 - B6.

for c1, c2 in cells:
    print("{0:8} {1:8}".format(c1.value, c2.value))

The format() function is used for neat output of data on the console.

$ ./read_cells2.py 
Items    Quantity
coins          23
chairs          3
pencils         5
bottles         8
books          30

This is the output of the program.

Openpyxl iterate by rows

The iter_rows method return cells from the worksheet as rows.

iterating_by_rows.py
#!/usr/bin/env python

from openpyxl import Workbook

book = Workbook()
sheet = book.active

rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)

for row in rows:
    sheet.append(row)
    
for row in sheet.iter_rows(min_row=1, min_col=1, max_row=6, max_col=3):
    for cell in row:
        print(cell.value, end=" ")
    print()    

book.save('iterbyrows.xlsx')

The example iterates over data row by row.

for row in sheet.iter_rows(min_row=1, min_col=1, max_row=6, max_col=3):

We provide the boundaries for the iteration.

$ ./iterating_by_rows.py 
88 46 57 
89 38 12 
23 59 78 
56 21 98 
24 18 43 
34 15 67

This is the output of the example.

Openpyxl iterate by columns

The iter_cols method return cells from the worksheet as columns.

iterating_by_columns.py
#!/usr/bin/env python

from openpyxl import Workbook

book = Workbook()
sheet = book.active

rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)

for row in rows:
    sheet.append(row)
    
for row in sheet.iter_cols(min_row=1, min_col=1, max_row=6, max_col=3):
    for cell in row:
        print(cell.value, end=" ")
    print()    

book.save('iterbycols.xlsx')

The example iterates over data column by column.

$ ./iterating_by_columns.py 
88 89 23 56 24 34 
46 38 59 21 18 15 
57 12 78 98 43 67

This is the output of the example.

Statistics

For the next example, we need to create a xlsx file containing numbers. For instance, we have created 25 rows of numbers in 10 columns with the RANDBETWEEN() function.

mystats.py
#!/usr/bin/env python

import openpyxl
import statistics as stats

book = openpyxl.load_workbook('numbers.xlsx', data_only=True)

sheet = book.active

rows = sheet.rows

values = []

for row in rows:
    for cell in row:
        values.append(cell.value)

print("Number of values: {0}".format(len(values)))
print("Sum of values: {0}".format(sum(values)))
print("Minimum value: {0}".format(min(values)))
print("Maximum value: {0}".format(max(values)))
print("Mean: {0}".format(stats.mean(values)))
print("Median: {0}".format(stats.median(values)))
print("Standard deviation: {0}".format(stats.stdev(values)))
print("Variance: {0}".format(stats.variance(values)))

In the example, we read all values from the sheet and compute some basic statistics.

import statistics as stats

The statistics module is imported to provide some statistical functions, such as median and variance.

book = openpyxl.load_workbook('numbers.xlsx', data_only=True)

Using the data_only option, we get the values from the cells, not the formula.

rows = sheet.rows

We get all the rows of cells that are not empty.

for row in rows:
    for cell in row:
        values.append(cell.value)

In two for loops, we form a list of integer values from the cells.

print("Number of values: {0}".format(len(values)))
print("Sum of values: {0}".format(sum(values)))
print("Minimum value: {0}".format(min(values)))
print("Maximum value: {0}".format(max(values)))
print("Mean: {0}".format(stats.mean(values)))
print("Median: {0}".format(stats.median(values)))
print("Standard deviation: {0}".format(stats.stdev(values)))
print("Variance: {0}".format(stats.variance(values)))

We compute and print mathematical statistics about the values. Some of the functions are built-in, others are imported with the statistics module.

$ ./mystats.py 
Number of values: 312
Sum of values: 15877
Minimum value: 0
Maximum value: 100
Mean: 50.88782051282051
Median: 54.0
Standard deviation: 28.459203819700967
Variance: 809.9262820512821

This is a sample output.

Openpyxl filter & sort data

A sheet has an auto_filter attribute, which allows to set filtering and sorting conditions.

Note that Openpyxl sets the conditions but we must apply them inside the Spreadsheet application.

filter_sort.py
#!/usr/bin/env python

from openpyxl import Workbook

wb = Workbook()
sheet = wb.active

data = [
    ['Item', 'Colour'],
    ['pen', 'brown'],
    ['book', 'black'],
    ['plate', 'white'],
    ['chair', 'brown'],
    ['coin', 'gold'],
    ['bed', 'brown'],
    ['notebook', 'white'],
]

for r in data:
    sheet.append(r)

sheet.auto_filter.ref = 'A1:B8'
sheet.auto_filter.add_filter_column(1, ['brown', 'white'])
sheet.auto_filter.add_sort_condition('B2:B8')

wb.save('filtered.xlsx')

In the example, we create a sheet with items and their colours. We set a filter and a sort condition.

Openpyxl dimensions

To get those cells that actually contain data, we can use dimensions.

dimensions.py
#!/usr/bin/env python

from openpyxl import Workbook

book = Workbook()
sheet = book.active

sheet['A3'] = 39
sheet['B3'] = 19

rows = [
    (88, 46),
    (89, 38),
    (23, 59),
    (56, 21),
    (24, 18),
    (34, 15)
]

for row in rows:
    sheet.append(row)

print(sheet.dimensions)
print("Minimum row: {0}".format(sheet.min_row))
print("Maximum row: {0}".format(sheet.max_row))
print("Minimum column: {0}".format(sheet.min_column))
print("Maximum column: {0}".format(sheet.max_column))

for c1, c2 in sheet[sheet.dimensions]:
    print(c1.value, c2.value)

book.save('dimensions.xlsx')

The example calculates the dimensions of two columns of data.

sheet['A3'] = 39
sheet['B3'] = 19

rows = [
    (88, 46),
    (89, 38),
    (23, 59),
    (56, 21),
    (24, 18),
    (34, 15)
]

for row in rows:
    sheet.append(row)

We add data to the worksheet. Note that we start adding from the third row.

print(sheet.dimensions)

The dimensions property returns the top-left and bottom-right cell of the area of non-empty cells.

print("Minimum row: {0}".format(sheet.min_row))
print("Maximum row: {0}".format(sheet.max_row))

Witht the min_row and max_row properties, we get the minimum and maximum row containing data.

print("Minimum column: {0}".format(sheet.min_column))
print("Maximum column: {0}".format(sheet.max_column))

With the min_column and max_column properties, we get the minimum and maximum column containing data.

for c1, c2 in sheet[sheet.dimensions]:
    print(c1.value, c2.value)

We iterate through the data and print it to the console.

$ ./dimensions.py 
A3:B9
Minimum row: 3
Maximum row: 9
Minimum column: 1
Maximum column: 2
39 19
88 46
89 38
23 59
56 21
24 18
34 15

This is the output of the example.

https://bf66057ac32c57f598df4bfd31d085c6.safeframe.googlesyndication.com/safeframe/1-0-37/html/container.html Sheets

Each workbook can have multiple sheets.

Sheets
Figure: Sheets

Let's have a workbook with these three sheets.

sheets.py
#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sheets.xlsx')

print(book.get_sheet_names())

active_sheet = book.active
print(type(active_sheet))

sheet = book.get_sheet_by_name("March")
print(sheet.title)

The program works with Excel sheets.

print(book.get_sheet_names())

The get_sheet_names method returns the names of available sheets in a workbook.

active_sheet = book.active
print(type(active_sheet))

We get the active sheet and print its type to the terminal.

sheet = book.get_sheet_by_name("March")

We get a reference to a sheet with the get_sheet_by_name() method.

print(sheet.title)

The title of the retrieved sheet is printed to the terminal.

$ ./sheets.py 
['January', 'February', 'March']
<class 'openpyxl.worksheet.worksheet.Worksheet'>
March

This is the output of the program.

sheets2.py
#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sheets.xlsx')

book.create_sheet("April")

print(book.sheetnames)

sheet1 = book.get_sheet_by_name("January")
book.remove_sheet(sheet1)

print(book.sheetnames)

book.create_sheet("January", 0)
print(book.sheetnames)

book.save('sheets2.xlsx')

In this example, we create a new sheet.

book.create_sheet("April")

A new sheet is created with the create_sheet method.

print(book.sheetnames)

The sheet names can be shown with the sheetnames attribute as well.

book.remove_sheet(sheet1)

A sheet can be removed with the remove_sheet method.

book.create_sheet("January", 0)

A new sheet can be created at the specified position; in our case, we create a new sheet at position with index 0.

$ ./sheets2.py 
['January', 'February', 'March', 'April']
['February', 'March', 'April']
['January', 'February', 'March', 'April']

This is the output of the program.

It is possible to change the background colour of a worksheet.

sheets3.py
#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sheets.xlsx')

sheet = book.get_sheet_by_name("March")
sheet.sheet_properties.tabColor = "0072BA"

book.save('sheets3.xlsx')

The example modifies the background colour of the sheet titled "March".

sheet.sheet_properties.tabColor = "0072BA"

We change the tabColor property to a new colour.

Background colour of a worksheet
Figure: Background colour of a worksheet

The background colour of the third worksheet has been changed to some blue colour.

Merging cells

Cells can be merged with the merge_cells method and unmerged with the unmerge_cells method. When we merge cells, all cells but the top-left one are removed from the worksheet.

merging_cells.py
#!/usr/bin/env python

from openpyxl import Workbook
from openpyxl.styles import Alignment

book = Workbook()
sheet = book.active

sheet.merge_cells('A1:B2')

cell = sheet.cell(row=1, column=1)
cell.value = 'Sunny day'
cell.alignment = Alignment(horizontal='center', vertical='center')

book.save('merging.xlsx')

In the example, we merge four cells: A1, B1, A2, and B2. The text in the final cell is centered.

from openpyxl.styles import Alignment

In order to center a text in the final cell, we use the Alignment class from the openpyxl.styles module.

sheet.merge_cells('A1:B2')

We merge four cells with the merge_cells method.

cell = sheet.cell(row=1, column=1)

We get the final cell.

cell.value = 'Sunny day'
cell.alignment = Alignment(horizontal='center', vertical='center')

We set text to the merged cell and update its alignment.

Merged cells
Figure: Merged cells
Openpyxl freeze panes

When we freeze panes, we keep an area of a worksheet visible while scrolling to another area of the worksheet.

freezing.py
#!/usr/bin/env python

from openpyxl import Workbook
from openpyxl.styles import Alignment

book = Workbook()
sheet = book.active

sheet.freeze_panes = 'B2'

book.save('freezing.xlsx')

The example freezes panes by the cell B2.

sheet.freeze_panes = 'B2'

To freeze panes, we use the freeze_panes property.

Openpyxl formulas

The next example shows how to use formulas. The openpyxl does not do calculations; it writes formulas into cells.

formulas.py
#!/usr/bin/env python

from openpyxl import Workbook

book = Workbook()
sheet = book.active

rows = (
    (34, 26),
    (88, 36),
    (24, 29),
    (15, 22),
    (56, 13),
    (76, 18)
)

for row in rows:
    sheet.append(row)

cell = sheet.cell(row=7, column=2)
cell.value = "=SUM(A1:B6)"
cell.font = cell.font.copy(bold=True)

book.save('formulas.xlsx')

In the example, we calculate the sum of all values with the SUM() function and style the output in bold font.

rows = (
    (34, 26),
    (88, 36),
    (24, 29),
    (15, 22),
    (56, 13),
    (76, 18)
)

for row in rows:
    sheet.append(row)

We create two columns of data.

cell = sheet.cell(row=7, column=2)

We get the cell where we show the result of the calculation.

cell.value = "=SUM(A1:B6)"

We write a formula into the cell.

cell.font = cell.font.copy(bold=True)

We change the font style.

Calculating the sum of values
Figure: Calculating the sum of values
Openpyxl images

In the following example, we show how to insert an image into a sheet.

write_image.py
#!/usr/bin/env python

from openpyxl import Workbook
from openpyxl.drawing.image import Image

book = Workbook()
sheet = book.active

img = Image("icesid.png")
sheet['A1'] = 'This is Sid'

sheet.add_image(img, 'B2')

book.save("sheet_image.xlsx")

In the example, we write an image into a sheet.

from openpyxl.drawing.image import Image

We work with the Image class from the openpyxl.drawing.image module.

img = Image("icesid.png")

A new Image class is created. The icesid.png image is located in the current working directory.

sheet.add_image(img, 'B2')

We add a new image with the add_image method.

Openpyxl Charts

The openpyxl library supports creation of various charts, including bar charts, line charts, area charts, bubble charts, scatter charts, and pie charts.

According to the documentation, openpyxl supports chart creation within a worksheet only. Charts in existing workbooks will be lost.

create_bar_chart.py
#!/usr/bin/env python

from openpyxl import Workbook
from openpyxl.chart import (
    Reference,
    Series,
    BarChart
)

book = Workbook()
sheet = book.active

rows = [
    ("USA", 46),
    ("China", 38),
    ("UK", 29),
    ("Russia", 22),
    ("South Korea", 13),
    ("Germany", 11)
]

for row in rows:
    sheet.append(row)
    
data = Reference(sheet, min_col=2, min_row=1, max_col=2, max_row=6)
categs = Reference(sheet, min_col=1, min_row=1, max_row=6)

chart = BarChart()
chart.add_data(data=data)
chart.set_categories(categs)

chart.legend = None
chart.y_axis.majorGridlines = None
chart.varyColors = True
chart.title = "Olympic Gold medals in London"

sheet.add_chart(chart, "A8")    

book.save("bar_chart.xlsx")

In the example, we create a bar chart to show the number of Olympic gold medals per country in London 2012.

from openpyxl.chart import (
    Reference,
    Series,
    BarChart
)

The openpyxl.chart module has tools to work with charts.

book = Workbook()
sheet = book.active

A new workbook is created.

rows = [
    ("USA", 46),
    ("China", 38),
    ("UK", 29),
    ("Russia", 22),
    ("South Korea", 13),
    ("Germany", 11)
]

for row in rows:
    sheet.append(row)

We create some data and add it to the cells of the active sheet.

data = Reference(sheet, min_col=2, min_row=1, max_col=2, max_row=6)

With the Reference class, we refer to the rows in the sheet that represent data. In our case, these are the numbers of olympic gold medals.

categs = Reference(sheet, min_col=1, min_row=1, max_row=6)

We create a category axis. A category axis is an axis with the data treated as a sequence of non-numerical text labels. In our case, we have text labels representing names of countries.

chart = BarChart()
chart.add_data(data=data)
chart.set_categories(categs)

We create a bar chart and set it data and categories.

chart.legend = None
chart.y_axis.majorGridlines = None

Using legend and majorGridlines attributes, we turn off the legends and major grid lines.

chart.varyColors = True

Setting varyColors to True , each bar has a different colour.

chart.title = "Olympic Gold medals in London"

A title is set for the chart.

sheet.add_chart(chart, "A8")

The created chart is added to the sheet with the add_chart method.

Bar chart
Figure: Bar chart

In this tutorial, we have worked with the openpyxl library. We have read data from an Excel file, written data to an Excel file.

https://bf66057ac32c57f598df4bfd31d085c6.safeframe.googlesyndication.com/safeframe/1-0-37/html/container.html

Visit Python tutorial or list all Python tutorials .

https://bf66057ac32c57f598df4bfd31d085c6.safeframe.googlesyndication.com/safeframe/1-0-37/html/container.html

© 2007 - 2020 Jan Bodnar admin(at)zetcode.com

[Jul 21, 2020] Python Automation Cookbook by Jaime Buelta

Jul 21, 2020 | amazon.com

How to do it...

  1. Import the csv module:
>>> import csv
  1. Open the file, create a reader, and iterate through it to show the tabular data of all rows (only three rows are shown):
>>> with open('top_films.csv') as file:
...   data = csv.reader(file)
...   for row in data:
...       print(row)
...
['Rank', 'Admissions\n(millions)', 'Title (year) (studio)', 'Director(s)']
['1', '225.7', 'Gone With the Wind (1939)\xa0(MGM)', 'Victor Fleming, George Cukor, Sam Wood']
['2', '194.4', 'Star Wars (Ep. IV: A New Hope) (1977)\xa0(Fox)', 'George Lucas']
...
['10', '118.9', 'The Lion King (1994)\xa0(BV)', 'Roger Allers, Rob Minkoff']
  1. Open the file and use DictReader to structure the data, including the header:
>>> with open('top_films.csv') as file:
...     data = csv.DictReader(file)
...     structured_data = [row for row in data]
...
>>> structured_data[0]
OrderedDict([('Rank', '1'), ('Admissions\n(millions)', '225.7'), ('Title (year) (studio)', 'Gone With the Wind (1939)\xa0(MGM)'), ('Director(s)', 'Victor Fleming, George Cukor, Sam Wood')])
  1. Each of the items in structured_data is a full dictionary that contains each of the values:
>>> structured_data[0].keys()
odict_keys(['Rank', 'Admissions\n(millions)', 'Title (year) (studio)', 'Director(s)'])
>>> structured_data[0]['Rank']
'1'
>>> structured_data[0]['Director(s)']
'Victor Fleming, George Cukor, Sam Wood'

[Jul 21, 2020] Financial Modeling in Excel For Dummies

Jul 21, 2020 | www.amazon.com
Understanding why you may want to use a named range

You don't have to include named ranges in a financial model, and some of the best financial models don't use them at all. Those who haven't used them before sometimes struggle to see the benefits of including them in financial models. Most of the time, named ranges aren't really necessary, but there are a few reasons why you should consider using them in a financial model:

[Jul 20, 2020] Why Excel Users Should Learn Python by Rafael Knuth

October 9, 2018 | www.datasciencecentral.com

Latest update: November 16, 2018

Microsoft Excel has been around for over 30 years now, and chances are it's not going to change in the foreseeable future. In fact, Excel is facing immense competition from challengers such as Google Spreadsheets and well-funded start-ups like Airtable, which are both going after Excel's massive user base of approximately 500 million worldwide. Tech-savvy small and mid-sized businesses embrace innovative alternatives to Excel. However, making a dent in the large enterprise space is a whole different ballgame. It's nearly impossible to rip and replace Excel along with Office 365 and the massive underlying infrastructure, which is serving ever-growing workloads. Not that it can't happen, but it might take decades.

The world runs on Excel, whether we like it or not

I recently spoke to a large consultancy which serves an enterprise with revenues north of 100 billion USD annually: "They run entirely on Excel." as one of the consultancy's top executive jovially confessed. This is not an exception. Let's take as an example a bank holding trillions of USD assets: How do they manage their credit risks? The chances are that their thousands of risk managers are holding one of the most valuable assets buried in hundreds of thousands, if not even millions of spreadsheets, scattered across the entire organization.

This is indeed no exception. Insurance companies? Pharmaceuticals? Retail? I would make an educated guess here, that very likely trillions of USD in annual revenues globally literally run through billions of spreadsheets.

How could things even get this far? Maybe Excel is being misused as a temporary quick-fix which has never matured into a permanent, enterprise-grade solution.

Enterprise-grade, Mezzogiorno style: Living with temporary solutions for ever

Southern Italy, commonly referred to Mezzogiorno, is amongst the most beautiful, yet poorest and worst-governed parts of Europe. A good friend of mine, who lives in Sicily, enjoys an enviably vast number of sunny days, beautiful sceneries and mouth-watering cuisine. To wine aficionados' delight, some of the world's best wineries are located in this region. On the flip side though, daily life can be harsh. "This whole region is depending on temporary, quick-fix type of infrastructure, which is never going to be replaced with permanent solutions. It's a miracle if things don't break." so said my dear friend over a bottle of red wine from east-Sicily. But what if they do?

It might sound like a gross exaggeration at first glance. But are IT departments acting that differently from officials in Sicily operating at the verge of possible a collapse? What could go wrong? A lot. Oracle, historically a life-or-death Microsoft competitor, put together a list of biggest business failures due to Excel misuse . The so-called "London Whale" incident, for example, caused JPMorgan a whopping 6.5 billion USD in losses and fines. The bank's risk analysts miscalculated the downside of its synthetic credit portfolio as a result of excessive formula-shuffling in Excel.

Excel was introduced in 1987. People were using fax machines back then

I bought my first laptop in 1992. It came with a back then great feature: It allowed me to send word documents via my telephone line to the recipient's fax machine. That was a great invention, which allowed me as a former copywriter to submit my work to my boss, who would manually review the printouts and send them back via fax. Excel was five years old back then, and I guess most people have not even heard of the internet yet.

I started using Excel approximately five years ago, and I quickly turned into a power user. To Excel's rescue, I have to admit that I am still joyfully using it on a daily basis. It has a wealth of features, many of which even seasoned Excel users are not aware of. However, we should take a hard look at Excel: Is this still the all-purpose solution for corporate data? It's certainly not.

Excel's role in the enterprise needs to be redefined

First and foremost: Is Excel the right place to hold mission-critical data? Large enterprises tend to have highly formalized, very tight IT security policies. However, on the other hand, the misuse of spreadsheets as the primary engine for multi-billion USD lines of businesses is entirely out of control. This is very much like owning a house within a gated community, protected by various alarm systems, and then handing over the keys to your teenage kids over the weekend – leaving a richly equipped bar unlocked and your jewelry and life's savings in cash stuffed into a cardboard box on the kitchen table.

However, what other options do business users have?

It's hard to imagine a corporate world without spreadsheets. Very much like it's hard to imagine a teenage house party without their parents' wine & whiskey collection up for grabs to everyone who shows up at the doorstep.

Your bank though, in case they manage their trillion USD crown jewels via spreadsheets, does so quietly. Consultancies helping large enterprises overcome their spreadsheet-addiction act just as secretively as the Betty Ford Center trying to protect their rich and famous clientele from reputational damage. The stakes are high.

Python is for data scientists. Is it?

Most companies I came across still bucket their knowledge-working staff into two categories: those with coding skills and those without them. Due to Evans Data Corporation EDC, there are roughly 23 million software developers worldwide. How many data scientists are out there? One can only estimate, for example, based on the total Jupyter project user base of presently 3 million people. Even if we round up the numbers generously to 30 million people with programming skills, this still represents less than 1% of the world's total adult population. This should ring everyone's alarm bells. Literally, everyone's.

Lack of programming skills is nothing else but illiteracy

Around the time when book production started to grow, roughly 12% of the adult world population was able to read and write. Today, around 86% of all adults worldwide are literates. At the same time, less than 1% of the total adult population is capable of reading and writing code.

"While the earliest forms of written communication date back to about 3,500-3,000 BCE, literacy remained for centuries a very restricted technology closely associated with the exercise of power." Ourworldindata.org

Literacy is associated with the exercise of power

The accumulation of wealth in the hands of few individuals and corporations can be well observed in rankings provided by Forbes and Fortune. An ever-growing amount of fortunes is being made of technology. However, is there a connection between technology literacy and the accumulation of wealth and power? There is not much hard evidence to answer that question, as this subject is not well researched yet.

In spite of a lack of research: What assumptions can we make regarding coding skills in the business world?

Want to empower business users? Teach them to code

What had happened, if Amazon Alexa was invented in the 16th century, when only less than 12% of the population was able to read and write? There would probably be little incentive for people to learn how to communicate in written form. Why invest time and effort into mastering the skill of writing tiny, little signs on a piece of paper when you can yell: "Alexa! What's the weather gonna be in LA tomorrow?" Of course, it's convenient. However, convenience comes at a high price. Also, as users get accustomed to the comfort of yelling and mumbling their commands into a smart device, they grow increasingly scared of the foundational skill of reading and writing: "It must be very tough to master "

Is coding in Python hard? Not harder than writing in natural language

Python is a high-level language. It's meant to be readable by humans and not surprisingly most Python tutorials start with writing print("hello world"). Guess what print("hello world") does? It prints the two worlds hello world. The process of displaying those two words on your computer screen is abstracted away. This is why Python is also called a declarative language. You, the author of your code, declare what you want the program to do: print hello world, perform a calculation or visualize data. The rest is being taken care of. An imperative programming language lets you describe the control flow, meaning: You have to write how exactly the program is going to execute printing hello world, perform a calculation or visualize data. By comparison, this is like driving a car with an automatic gear shift (declarative) vs. one with a manual gearbox (imperative).

High level, declarative languages are making programming incredibly easy. Yet, less than 1% of the world's adult population is capable of writing and reading code. Not surprisingly, businesses start discovering Python as an alternative to Excel for their mission-critical workloads. JPMorgan just recently made headlines with their effort to teach their analysts to code, very likely thousands of employees across the entire organization. Yes, the same JPMorgan which lost 6.5 billion USD in the course of the "London Whale" disaster. Maybe they learned their lessons that not everything that can be done in Excel should also be done in Excel?

If you have ever written an Excel function, you can do the same in Python and much more

Getting started with Python is just as easy as picking up Excel. Every Excel user knows how easy it is to sum up numbers in a column. Just write =SUM and then select the cells you want to sum up. In case you want to sum up cells which meet specific criteria, you can write an =IF or =IFS statement. All this can be done in Python as well. As an advanced Excel user, you might ask yourself: "Why should I learn Python if I already know how to do the same in Excel?" That's an excellent question, and I kept asking it myself back a few years ago. Here's my answer:

#1 Mastering Excel is harder than wrapping your head around Python

Excel is not just one product, but three products in one:

If you want to work with spreadsheets, clean up your data and perform advanced calculations on multiple tables connected into one data model - sure you can do that in Excel (surprisingly, most advanced Excel users are not aware of that). However, you have to juggle around with three different UIs and three separate languages: Excel's regular function language, M-Language in Power Query and DAX (Data Analysis Expressions) in Power Pivot. Oh, did I mention VBA?

I find it more convenient to do all of the above in one language with one consistent syntax and semantics. By the way, why do I prefer Python over R? Because Python is easier to learn. Ease of use matters a lot to me.

#2 People around you can understand what you write in Python

Data scientists are often left clueless when business users throw their Excel spreadsheets over the fence. Even as a business user, it's hard to read and interpret someone else's spreadsheet. Now imagine a data scientist, who has never worked with Excel spreadsheets.

Many data analysts and data scientists use Jupyter Notebooks. I call it "Word for Geeks" because Jupyter Notebooks allows you to tell your story in natural language using headlines and body text like in a Word document, and in addition to that you can write and execute code in the same place.

Once I discovered Jupyter Notebooks, I instantly fell in love with it, because it allows me to communicate my ideas in natural language based on data I prepare, analyze and visualize in Python – all in one document. If I hit the wall, I can share my Jupyter Notebook with a data scientist, and he can just read through my document, understand my data analysis and pick up where I left. Once he's done with his work, I can read through it and see what he has done.

Business users and data scientists finally united in love!

#3 Python is open source, and the community is very active and creative

I can't think of a data related problem that I would not be able to solve in Python in conjunction with one of its libraries. Whether it's extracting data from Excel, cleansing data, performing calculations, visualizing data or utilizing various APIs - there is certainly a package out there.

In a future not so distant from now, I envision analysts within banks, for example, utilizing Python packages built internally specifically for their proprietary risk assessment. No more murky Excel functions nobody other than its creator comprehends. Just one repository with one centrally maintained set of Python packages for the bank's internal risk analysis and assessment.

"But didn't you say open source?" some concerned voices might ask. Open source should not be misinterpreted as "open door". You can apply any security level you want and thus ensure that your packages with your proprietary risk models are only available to a selected group of people within your organization and nobody else.

#4 Utility grade is where Python shines and not Excel

There is a growing number of data analytics and data science PaaS solutions which are built with petabyte-scale, highly collaborative use cases in mind such as Cloudera Altus, Microsoft Kusto and T-Systems Data Science Workstation.

If your business requires you to crunch vast amounts of data in near real time, and if you are increasingly faced with semi-structured and unstructured data, you are better served with Python, Jupyter Notebooks and a big data cluster within one of the aforementioned PaaS solutions.

#5 You want to keep your job

Due to recent McKinsey research, around 300 million workers need to get retrained to meet the requirements of job markets by 2030. I wrote two blog posts on transitioning from a line of business career into data analytics and data science in my previous posts.

Programming is all about automation. The more companies automate their processes, the less they need to rely on people performing mundane tasks. "Do you want to keep your job?" it all comes down to this question. If your answer is "Yes," you should consider learning to code. Python is a good starting point.

I work as a data literacy expert, and I cater to large companies in Europe and the US. Maybe you have questions I didn't answer in my write-up? Please leave a comment or reach out to me via email rafael@knuthconcepts.com or LinkedIn .


me title=



DSC Podcast


Most Popular Content on DSC

To not miss this type of content in the future, subscribe to our newsletter.

Other popular resources

Archives: 2008-2014 | 2015-2016 | 2017-2019 | Book 1 | Book 2 | More

Follow us : Twitter | Facebook

Views: 15267

Tags: Analytics , Career , Data , Excel , Python , Science , Transition

Like 9 members like this

Share

https://www.datasciencecentral.com/profiles/blogs/why-excel-users-should-learn-python

https://www.facebook.com/v2.3/plugins/like.php?app_id=259683271111061&channel=https%3A%2F%2Fstaticxx.facebook.com%2Fx%2Fconnect%2Fxd_arbiter%2F%3Fversion%3D46%23cb%3Df3966dcd4762508%26domain%3Dwww.datasciencecentral.com%26origin%3Dhttps%253A%252F%252Fwww.datasciencecentral.com%252Ff4cd2f35728768%26relation%3Dparent.parent&container_width=0&href=https%3A%2F%2Fwww.datasciencecentral.com%2Fxn%2Fdetail%2F6448529%3ABlogPost%3A766500&layout=button_count&locale=en_US&sdk=joey&show_faces=false&width=450

Comment

You need to be a member of Data Science Central to add comments!

Join Data Science Central

Comment by Ramesh Gopal on October 27, 2018 at 7:52pm

Hi Rafael,

I agree with what you're saying, and also what the other commenter has said.

Many consultants, IT experts and even risk managers have pointed out for decades about how fragile spreadsheets are, and the risks posed to businesses who rely on Excel (or spreadsheets in general), without adopting proper safeguards.

However, Excel is the ultimate business software tool. Its very weaknesses, which you clearly point out in your article, are also its enduring strengths. It will be near impossible to dislodge Excel from its current position as it fills an important niche or gap in enterprises as THE information tool which runs business workgroups, departments and divisions, especially when corporate IT takes ages to deliver functionality end-users want.

And trying to teach programming to existing workers isn't likely to succeed in a big way for several reasons.

One, not everyone has a brain wired to "get" code. Studies have shown that you need a certain type of abstract thinking capabilities to understand programming.

Two, those in operations who are taught to code will still find coding time-consuming and frustrating if the skill is not exercised day-in, day-out. Developers do this as part of their jobs, but operations people have real work to do, in addition to using Excel as a data and model repository.

Three, daily work pressures and deadlines also mean that enterprises aren't likely to send their personnel for programming training en masse.

However, the future is bright for the coming generation. They're likely learning programming in school and such fundamental skills and concepts, learnt at a young age, will typically persist when they enter the workforce.

So what is called for today is pragmatism, and figuring out how to co-exist with spreadsheets, while trying to minimize the damage done by spreadsheets gone bad.

Comment by Ulf Morys on October 14, 2018 at 12:15am

Hi Raffael, great post - and I absolutely second the observations.

One thing, though, merits to be mentioned explixitly: the core of the Excel problem is simply the fact that, at no point, there is a clear seperation of data from logic in Excel sheets. You see results - which is usually good. But if you don't see what you expected, it's very tedious to identify if (a) there really is an unexpected result or (b) there is a problem with the data or (c) a problem with the logic.

The reason is obvious and simple: any Excel cell may hold either data...or a function and thus logic.

I have stopped counting the sheets and models I have seen where just a single, accidental copy&paste or data entry has produced huge confusion and loss of time.

The beauty of Python (...and Pandas, I suppose) in combination with Jupyter Notebook (...we seem to have fallen for the same solution): You start out with a clear, pure data set, you apply a clear logic (aka run the data thru the code) and end up with data again.

The approach separates clearly data from logic - and the logic becomes better understandable and more clearly documented and reproducible with the Notebook.

Really, if I had to express the advantage of Python over Excel in a single statement it would be just this: Separation of logic from data.

Ramesh Gopal on October 27, 2018 at 7:52pm

Hi Rafael,

I agree with what you're saying, and also what the other commenter has said.

Many consultants, IT experts and even risk managers have pointed out for decades about how fragile spreadsheets are, and the risks posed to businesses who rely on Excel (or spreadsheets in general), without adopting proper safeguards.

However, Excel is the ultimate business software tool. Its very weaknesses, which you clearly point out in your article, are also its enduring strengths. It will be near impossible to dislodge Excel from its current position as it fills an important niche or gap in enterprises as THE information tool which runs business workgroups, departments and divisions, especially when corporate IT takes ages to deliver functionality end-users want.

And trying to teach programming to existing workers isn't likely to succeed in a big way for several reasons.

One, not everyone has a brain wired to "get" code. Studies have shown that you need a certain type of abstract thinking capabilities to understand programming.

Two, those in operations who are taught to code will still find coding time-consuming and frustrating if the skill is not exercised day-in, day-out. Developers do this as part of their jobs, but operations people have real work to do, in addition to using Excel as a data and model repository.

Three, daily work pressures and deadlines also mean that enterprises aren't likely to send their personnel for programming training en masse.

However, the future is bright for the coming generation. They're likely learning programming in school and such fundamental skills and concepts, learnt at a young age, will typically persist when they enter the workforce.

So what is called for today is pragmatism, and figuring out how to co-exist with spreadsheets, while trying to minimize the damage done by spreadsheets gone bad.

Ulf Morys on October 14, 2018 at 12:15am

Hi Raffael, great post - and I absolutely second the observations.

One thing, though, merits to be mentioned explixitly: the core of the Excel problem is simply the fact that, at no point, there is a clear seperation of data from logic in Excel sheets. You see results - which is usually good. But if you don't see what you expected, it's very tedious to identify if (a) there really is an unexpected result or (b) there is a problem with the data or (c) a problem with the logic.

The reason is obvious and simple: any Excel cell may hold either data...or a function and thus logic.

I have stopped counting the sheets and models I have seen where just a single, accidental copy&paste or data entry has produced huge confusion and loss of time.

The beauty of Pyhton (...and Pandas, I suppose) in combination with Jupyter Notebook (...we seem to have fallen for the same solution): You start out with a clear, pure data set, you apply a clear logic (aka run the data thru the code) and end up with data again.

The approach seperates clearly data from logic - and the logic becomes better understandable and more clearly documented and reproducable with the Notebook.

Really, if I had to express the advantage of Python over Excel in a single statement it would be just this: Seperation of logic from data.

[Jan 05, 2019] Use a formula to determine which cells to format

Notable quotes:
"... If you are apply to group of cell horizontal-wise, make sure the formulas don't contain $ signs that would make the references absolute. ..."
Jan 05, 2019 | www.excelforum.com
  1. Re: Use a formula to determine which cells to format?
    After select the cell (e.g. B1)

    enter formula: =A1>B1 and choose RED

    then add new rule with formula =A1<B1 and choose GREEN from Fill tab.

    If you are apply to group of cell horizontal-wise, make sure the formulas don't contain $ signs that would make the references absolute.

[Jan 03, 2019] Using Lua for working with excel - Stack Overflow

Jan 03, 2019 | stackoverflow.com

Using Lua for working with excel Ask Question 2


Animesh ,Oct 14, 2009 at 12:04

I am planning to learn Lua for my desktop scripting needs. I want to know if there is any documentation available and also if there are all the things needed in the Standard Lib.

uroc ,Oct 14, 2009 at 12:09

You should check out Lua for Windows -- a 'batteries included environment' for the Lua scripting language on Windows

http://luaforwindows.luaforge.net/

It includes the LuaCOM library, from which you can access the Excel COM object.

Try looking at the LuaCOM documentation, there are some Excel examples in that:

http://www.tecgraf.puc-rio.br/~rcerq/luacom/pub/1.3/luacom-htmldoc/

I've only ever used this for very simplistic things. Here is a sample to get you started:

-- test.lua
require('luacom')
excel = luacom.CreateObject("Excel.Application")
excel.Visible = true
wb = excel.Workbooks:Add()
ws = wb.Worksheets(1)

for i=1, 20 do
    ws.Cells(i,1).Value2 = i
end

Animesh ,Oct 14, 2009 at 12:26

Thanks uroc for your quick response. If possible, please let me know of any beginner tutorial or atleast some sample code for using COM programming via Lua. :) – Animesh Oct 14 '09 at 12:26

sagasw ,Oct 16, 2009 at 1:02

More complex code example for lua working with excel:
require "luacom"

excel = luacom.CreateObject("Excel.Application")

local book  = excel.Workbooks:Add()
local sheet = book.Worksheets(1)

excel.Visible = true

for row=1, 30 do
  for col=1, 30 do
    sheet.Cells(row, col).Value2 = math.floor(math.random() * 100)
  end
end


local range = sheet:Range("A1")

for row=1, 30 do
  for col=1, 30 do
    local v = sheet.Cells(row, col).Value2

    if v > 50 then
        local cell = range:Offset(row-1, col-1)

        cell:Select()
        excel.Selection.Interior.Color = 65535
    end
  end
end

excel.DisplayAlerts = false
excel:Quit()
excel = nil

Another example, could add a graph chart.

require "luacom"

excel = luacom.CreateObject("Excel.Application")

local book  = excel.Workbooks:Add()
local sheet = book.Worksheets(1)

excel.Visible = true

for row=1, 30 do
  sheet.Cells(row, 1).Value2 = math.floor(math.random() * 100)
end

local chart = excel.Charts:Add()
chart.ChartType = 4  --  xlLine

local range = sheet:Range("A1:A30")
chart:SetSourceData(range)

Incredulous Monk ,Oct 19, 2009 at 4:17

A quick suggestion: fragments of code will look better if you format them as code (use the little "101 010" button). – Incredulous Monk Oct 19 '09 at 4:17

Professional Excel Development The Definitive Guide to Developing Applications Using Microsoft Excel and VBA Stephen Bullen

XL-Dennis on March 6, 2005
The book is targeting power users and professional developers and if You consider Yourself to be in this group then this book is for you.

It's a well written book, with a high technical level and with a good structure.

But as with every book it has its strong parts as well as weak parts. Weak parts in terms of that some chapters are overviews only and do not give any depth on the subjects they cover.

The chapters that cover best practice are all excellent. Here the authors share all their experience & knowledge by discussion application structures, structure for notation, advanced techniques for userforms and toolbars, error handling, debugging and many other things. The chapters also cover well how to create and use add-ins and so called user defined functions (UDFs).

The chapters that deal with VB 6.0 & Excel are all excellent too. A welcome contribution is that this book leverages many of the good techniques we use in MS VB 6.0 into Excel, especially when it comes to userforms.

The book covers in a nice way how to create & use classes, create & use ActiveX DLL and front-loaders for Excel in MS VB 6.0.

A whole chapter is devoted to Visual Tools for Office System (VSTO) but point also out all the present shortcomings of VSTO.

As with every book nowadays this book also includes a chapter about XML which is well written and with a high technically level. However the chapter is rather short which tend to compress the content.

If the above is what You're looking for then this book is a must have.

The following chapters provide overviews of the subjects:

Worksheet Design

Programming with Database

Data Manipulation techniques

Windows API

Creating XLLs with C#

If You're looking to get more then an introduction or compressed picture on these subject then there exist other books that will give You more.

The book uses an example, PETRAS Timesheet, to exemplify the subject that is covered in each chapter. Some readers will appreciate it very much while other (like me) will skip it. Anyway, PETRAS timesheet is also well worked out like the rest of the book.

In order to work through both the PETRAS Timesheet case and all the examples the book is given You need to have access to the following softwares:

MS Office 2003 Professional (minimum requirement for the VSTO-chapter)

MS Visual Basic 6.0 (no longer available for sale from Microsoft)

Visual Studio.NET (C#)

VSTO (Separate tool that require Visual Studio.NET)

Kind regards,

Dennis Wallentin aka XL-Dennis

[Jun 14, 2015] Technical Analysis in Excel SMA, EMA, Bollinger bands

­ how traders can use Excel to apply technical analysis (TA) to historical market data. This will include computation of some of the most popular technical analysis indicators and implementation of a trading strategy backtesting spreadsheet (in Part III). Backtesting will involve generation of buy and sell signals based on TA indicators and computation of strategy P&L. We'd like to point out upfront that all computations in these articles will be performed using standard Excel functions available in Excel 2011 and later. We will not be using any VBA/custom Excel macros. This is done on purpose to keep spreadsheets simple and functionality understandable by non-programmers.

In the first part of this article series we will create an Excel spreadsheet where we will use formulas some common technical analysis indicators such as: Simple Moving Average, Bollinger Bands, and Exponential Moving Average. We'll explain the formulas and include step-by-step instructions below. In addition, we are providing a spreadsheet we've created by following steps listed in this article so that you can use it for your own market data analysis or as basis for building your own spreadsheets.

Sample Excel File

Excel file (download) containing formulas for calculation of simple moving average, Bollinger Bands, and exponential moving average as described in this post.

Data File

For this example we've got a CSV file with 6 months of hourly SPY data, covering Sep 3, 2013 – Feb 28, 2014. SPY is an ETF tracking S&P500 index. We have nearly 2000 data points in this file. The file contains OHCL price columns, volume, and timestamp column. Disclaimer: this file has been generated using IB Data Downloader.

Data file: historical_data_SPY_1hour_20140301 (text file – to download – right-click and select "Save Linked File As…")

Simple Moving Average Basic Calculation

Simple Moving Average (SMA) is simply the average price over last N number of bars. Let's calculate SMA for the close prices from our sample data file.

We'll be calculating a 20-day moving average based on the SPY close price (column D). Let's add column header "SMA-20" in column G and we type in the following formula value in cell G21 (since row 21 is the first one that has enough data to calculate 20-day SMA):

=SUM(D2:D21)/20

ta_excel_sma_1

After hitting return to save the formula you should see value '164.57' or close to that in cell G21. In order to calculate SMA-20 for all of the remaining cells below – just select cell G21, move cursor over cell and double-click the small square in the lower-right corner of that cell. You should now see values in column G calculated for the remainder of SPY prices.

ta_excel_sma_2

Generalizing SMA Calculation

Now we have calculated 20-day simple moving average values in column G . It's great, but what if we want to calculate 50-day, or 200-day SMA now? Updating formula values every time you want to change SMA range is pretty tedious and error-prone. Let's make our calculation more generic by adding a "length" parameter. We can start off by storing SMA range parameter in a separate cell so that we can reference it in or formula.

Here are the steps we followed to implement a generic SMA calculation in our spreadsheet:

  1. Let's start off by creating a little table on the side where we can store some input parameter values for our indicators.
  2. In cell O1 let's type "Variable Name", in cell P1 let's type "Value".
  3. In cell O2 let's type name of our variable: "PERIOD".
  4. In cell P2 we specify value of the "PERIOD" variable which we'll be using to specify period length for our generalized SMA calculation. Changing this variable will trigger recalculation of SMA with the current period value. Let's use value 14 for now.
  5. Let's type column header value "SMA" in cell H1; column H will contain values for our generic SMA indicator.
  6. In cell H2 enter this formula:

    =SUM(OFFSET(H2,(-1*$P$2+1),-4,$P$2,1))/$P$2

    ta_excel_sma_gen_1

    Let's dissect this formula. We are now using value of our PERIOD variable from cell P2. We had to add $ in front of column and row numbers to freeze reference to cell P2 as we copy SMA formula to other cells in column H. We've also replaced absolute reference to the Close column price range with the OFFSET Excel function. OFFSET returns a range of cells based on the offset in terms of number rows and columns from a given "reference" cell. First parameter is the reference cell (in our case H2 itself), second is an expression calculating the first row of the range based on the value of length parameter ($P$2), 3rd parameter is the column offset to the Close column (-4), negative value represents offset to the left while positive is offset to the right of the reference cell, and the last function parameter with value 1 represents the width of the range returned by OFFSET function, which in our case is just one column: D (CLOSE).

  7. Save the formula in cell in H2 and expand it to the rest of cells in column H by double-clicking the little square in lower-right corner of the cell, or dragging the formula down.
Removing Formula Errors

Now, you will notice that first several rows in the column have error value #REF!. This happens because there are not enough rows in our data set to calculate the SMA value, and the range returned by OFFSET function goes over the edge of the worksheet for some rows. There exists a number of various techniques to hide error values in excel. Some of them involve formulas which return blank or zero values if a cell value contains an error. While this is perfectly valid technique- it complicates cell formulas and makes them hard to read. Instead, we'll use conditional formatting to simply hide error values be changing foreground color to white. To change cell's font color to white and use no error highlighting follow these instructions:

  1. Select columns H-N
  2. In Excel: Home -> Conditional Formatting -> Highlight Cell Rules -> More Rules.
  3. In the "New Formatting Rule" dialog select "Errors" and in "Format with…" select "Custom format", then set Fill color to white and font color to white as well.

ta_excel_error_formatting_1

ta_excel_error_formatting_2

Bollinger Bands Introduction

Bollinger Bands is a simple but useful indicator providing valuable information on historical price volatility of a financial instrument, as well as current price deviation from a moving average. When price moves become more volatile – the bands widen, in the periods of relative calm – they come closer together. The relative position of the current price to the bands can also be used to estimate whether market is overbought or oversold. If the current price is close to or crossed upper band – the price is considered in overbought territory, while price close to/crossed lower band – underlying market is considered oversold.

Basic Calculation

Bollinger Bands indicator could be calculated using either simple moving average or exponential moving average as the basis. Bollinger Bands consists of three data series: moving average (simple or exponential) and two standard deviation (boundary) lines, one above, and one below the moving average, usually at 2 standard deviations from the moving average. Exponential moving average (covered below) gives more weight to the more recent price action, while Simple moving average provides a more stable and less jittery indicator. There are a total of 2 input parameters: 1) moving average period (number of bars), 2) number of standard deviations for the upper band lower bands. In this example we'll use simple moving average we already calculated in column H (see instructions in the section above). All that's remaining is to add columns for upper and lower bands.

  1. We are still using 14-day moving average period value. The first row that has enough data for 14-day SMA is row 15 (since row 1 is used for column header). The upper band will be in column I, so in cell I15 we type the following formula:

    =H15+2*STDEV(D2:D15)

    In this formula we are simply adding two standard deviations of the Close prices from cells D2:D15 to the SMA value.

    ta_excel_bb_1_upper

  2. And for lower band in column I we enter the following:

    =H15-2*STDEV(D2:D15)

    ta_excel_bb_2_lower

    Here the only difference from the previous formula is that we are subtracting two standard deviations from SMA. Excel formula STDEV() calculates standard deviation for a series of values. In this case we are multiplying value by 2 to get 2 standard deviations, and adding/subtracting the result from the moving average to generate the upper/lower band values.

  3. To expand the formulas – just roll over and double-click on a small square in the lower-right corner of the cell to replicate formula for the rest of the data range.

Generalized Bollinger Band Computation

Now , how about generalizing the Bollinger Band formula so that we don't have to update our formulas every time we want to calculate Bollinger bands for different number of standard deviations from MA or when we change moving average length.

  1. Let's add another parameter to our generic variables table on the right of the spreadsheet. Let's type "Std devs:" in cell O3, and 2.0 in P3.
  2. Next, let's add the following formula in I15:

    =H15+$P$3*STDEV(OFFSET(I15,(-1*$P$2+1),-5,$P$2,1))

    In this formula we've replaced 2 with $P$3 – which points to our variable in cell P3 containing number of standard deviations for the bands, and calculate offset based on the PERIOD variable in cell P2.

    ta_excel_bb_3_upper

  3. Similarly in cell J15 let's now enter this formula:

    =H15-$P$3*STDEV(OFFSET(J15,(-1*$P$2+1),-6,$P$2,1))

    The only difference from the formula in the previous step is that we've replaced + after H15 with – (minus), to subtract number of standard deviations from SMA, and we had to change offset to the price columnd , notice -6, instead of -5 in the "cols" parameter to the OFFSET function to refer to column D (CLOSE).

  4. Don't forget to copy new formulas in cells I15 and J15 to the rest of the respective column cells.

    Technical Analysis in Excel - Bollinger Bands

    You can now change values of "PERIOD" and "Std devs" variables in cells P2 & P3, and have SMA and Bollinger Band values automatically recalculated.

[May 29, 2015] Goldman goes astray on Grand Theft Auto and productivity by Matthew C Klein

May 27, 2015 | FT Alphaville

By contrast, consider Microsoft Excel, one of the greatest computer programmes ever created, and among the most integral to global business. It would be ludicrous to think that increasing the number of cells in a worksheet increases the value of new versions by a comparable amount. In fact, many of the most active Excel users prefer the 2003 version, presumably because the genuine improvements are outweighed by various hidden costs.

Similarly, many companies still prefer to use Microsoft's Windows 7 because the newer versions have features that aren't particularly useful for workers sitting at desks.

[Dec 27, 2013] There was a problem sending the command to the program error in Excel

Microsoft Excel 2007
  1. Click the Microsoft Office Button, and then click Excel Options.
  2. Click Advanced, and then clear the Ignore other applications that use Dynamic Data Exchange (DDE) check box in the General area.
  3. Click OK.

[May 1, 2013] With Great Power . . . by James Kwak

May 1, 2013 | 13 Comments

By James Kwak

A friend brought to my attention another example of how Excel may actually be a precursor of Skynet, after the London Whale trade and the Reinhart-Rogoff controversy. This comes to us in a research note from several years ago by several bioinformatics researchers titled "Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics." The problem is that various genes have names like "DEC1″ or identifiers like "2310009E13." When you important those text strings into Excel, by default, the former is converted into a date and the later is converted into scientific notation (2.310009 x 10^13). Since dates in Excel are really numbers behind the scenes (beginning with January 1, 1900), those text identifiers have been irretrievably converted into numbers.

This problem is related to what makes Excel so popular: it's powerful, intuitive, and easy to use. In this case, it is guessing at what you really mean when you give it data in a certain format, and most of the time it's right-which saves you the trouble of manually parsing text strings and converting them into dates (which you can do using various Excel functions, if you know how). But the price of that convenience is that it also makes it very easy to make mistakes, if you don't know what you're doing or you're not extremely careful.

There are workarounds to this problem, but as of 2004, it had infected several public databases. As the authors write, "There is no way to know how many times and in how many laboratories the default date and floating point conversions to non-gene names have adversely affected an experiment or caused genes to 'disappear' from view."

[Feb 9, 2013] The Importance of Excel By James Kwak

February 9, 2013 | baselinescenario.com | 35 Comments

I spent the past two days at a financial regulation conference in Washington (where I saw more BlackBerries than I have seen in years-can't lawyers and lobbyists afford decent phones?). In his remarks on the final panel, Frank Partnoy mentioned something I missed when it came out a few weeks ago: the role of Microsoft Excel in the "London Whale" trading debacle.

The issue is described in the appendix to JPMorgan's internal investigative task force's report. To summarize: JPMorgan's Chief Investment Office needed a new value-at-risk (VaR) model for the synthetic credit portfolio (the one that blew up) and assigned a quantitative whiz ("a London-based quantitative expert, mathematician and model developer" who previously worked at a company that built analytical models) to create it. The new model "operated through a series of Excel spreadsheets, which had to be completed manually, by a process of copying and pasting data from one spreadsheet to another." The internal Model Review Group identified this problem as well as a few others, but approved the model, while saying that it should be automated and another significant flaw should be fixed.** After the London Whale trade blew up, the Model Review Group discovered that the model had not been automated and found several other errors. Most spectacularly,

"After subtracting the old rate from the new rate, the spreadsheet divided by their sum instead of their average, as the modeler had intended. This error likely had the effect of muting volatility by a factor of two and of lowering the VaR . . ."

I write periodically about the perils of bad software in the business world in general and the financial industry in particular, by which I usually mean back-end enterprise software that is poorly designed, insufficiently tested, and dangerously error-prone. But this is something different.

Microsoft Excel is one of the greatest, most powerful, most important software applications of all time.** Many in the industry will no doubt object. But it provides enormous capacity to do quantitative analysis, letting you do anything from statistical analyses of databases with hundreds of thousands of records to complex estimation tools with user-friendly front ends. And unlike traditional statistical programs, it provides an intuitive interface that lets you see what happens to the data as you manipulate them.

As a consequence, Excel is everywhere you look in the business world-especially in areas where people are adding up numbers a lot, like marketing, business development, sales, and, yes, finance. For all the talk about end-to-end financial suites like SAP, Oracle, and Peoplesoft, at the end of the day people do financial analysis by extracting data from those back-end systems and shoving it around in Excel spreadsheets. I have seen internal accountants calculate revenue from deals in Excel. I have a probably untestable hypothesis that, were you to come up with some measure of units of software output, Excel would be the most-used program in the business world.

But while Excel the program is reasonably robust, the spreadsheets that people create with Excel are incredibly fragile. There is no way to trace where your data come from, there's no audit trail (so you can overtype numbers and not know it), and there's no easy way to test spreadsheets, for starters. The biggest problem is that anyone can create Excel spreadsheets-badly. Because it's so easy to use, the creation of even important spreadsheets is not restricted to people who understand programming and do it in a methodical, well-documented way.***

This is why the JPMorgan VaR model is the rule, not the exception: manual data entry, manual copy-and-paste, and formula errors. This is another important reason why you should pause whenever you hear that banks' quantitative experts are smarter than Einstein, or that sophisticated risk management technology can protect banks from blowing up. At the end of the day, it's all software. While all software breaks occasionally, Excel spreadsheets break all the time. But they don't tell you when they break: they just give you the wrong number.

There's another factor at work here. What if the error had gone the wrong way, and the model had incorrectly doubled its estimate of volatility? Then VaR would have been higher, the CIO wouldn't have been allowed to place such large bets, and the quants would have inspected the model to see what was going on. That kind of error would have been caught. Errors that lower VaR, allowing traders to increase their bets, are the ones that slip through the cracks. That one-sided incentive structure means that we should expect VaR to be systematically underestimated-but since we don't know the frequency or the size of the errors, we have no idea of how much.

Is this any way to run a bank-let alone a global financial system?

* The flaw was that illiquid tranches were given the same price from day to day rather than being priced based on similar, more liquid tranches, which lowered estimates of volatility (since prices were remaining the same artificially).

** But, like many other Microsoft products, it was not particularly innovative: it was a rip-off of Lotus 1-2-3, which was a major improvement on VisiCalc.

*** PowerPoint has an oft-noted, parallel problem: It's so easy to use that people with no sense of narrative, visual design, or proportion are out there creating presentations and inflicting them on all of us.

Update 2/10: There is an interesting follow-on discussion that includes a lot of highly-informed technical people, including some who work in finance, over at Hacker News.

35 Responses to The Importance of Excel

  1. Sash | February 9, 2013 at 3:43 pm |

    I think there is a more fundamental problem, the model distorts the nature of the reality. Back in the late 80s i spent a summer in one of NY's largest real estate firms. At the time spreadsheets were relatively new.

    My boss observed that the spreadsheet made people think that real estate was a bond and not real estate.

  2. cubicamente | February 9, 2013 at 4:11 pm |

    In my opinion, if a 'model devoloper' has left to his users the possibility to copy and paste or something like that, he did a bad job. With my work, I use Excel spreadsheets every day but when I start to develop a project, I have in my mind a system with an engine (the spreadsheet with formulas and macros) and with an user interface, with icons that prevent at users to do anything not provided. So, the core of the spreadsheet can work without modifications, and results can be more reliable.

  3. Vic Volpe | February 9, 2013 at 6:06 pm |

    Do you suppose they use Excel because they don't know how to do statistical analysis in a database program?

  4. BillK | February 9, 2013 at 6:07 pm |

    Points out the fact that spreadsheet models should be reviewed and checked by other experts BEFORE they go into production systems.

  5. David | February 9, 2013 at 7:34 pm |

    I am a professional software engineer, and what I design and build works because I know what I'm doing. I have no sympathy for any of this.

    Programming IS an extreme sport. Leave it to the professionals, and do NOT attempt it yourself at home.

  6. Jason | February 9, 2013 at 7:38 pm |

    " Errors that lower VaR, allowing traders to increase their bets, are the ones that slip through the cracks. That one-sided incentive structure means that we should expect VaR to be systematically underestimated . . .."

    This. A thousand times, this.

    They'd probably use Ouija boards if they justified high-risk/high-reward strategies. ;)
    After all, what happened to ratings agencies that dared to call crap "crap?"

  7. azlib | February 9, 2013 at 11:55 pm |

    As a software engineer what you describe is all too common. Excel is very powerful because it is relatively easy to use, but its ease of use means anybody can write a program using the tool which can be very difficult to understand or test.

    What amazes me that the model was not tested more thoroughly considering the risks if there were systematic errors in the calculations.

  8. colby | February 10, 2013 at 12:13 am |

    cubicamente – I agree, I follow the same logic. The users are dumb so prevent them from doing anything other than what you build the tool to do.

    However, to James' point, anyone can create a spreadsheet, and their is no formal user acceptance testing, unit testing, stress testing, etc that occurs with the normal SDLC. I have been stating for many years at my current employer that we should not be running our business on an Excel spreadsheet, even when we run SAP. There is far to much downside risk and limited upside from the productivity. Additionally, I think it makes people dumber and less likely to critically think through situations.

  9. aronjohnson | February 10, 2013 at 12:48 am |

    The mistake made in the London Whale case could have been easily caught using the Formulas/Trace Precedents and Trace Dependents tools.This tool is invaluable in debugging complex spreadsheets.

  10. osamakhn | February 10, 2013 at 1:22 am

    I guess this is where python and its libraries come in. You can build reliable models using code. Maybe its time the curriculum of mba/accounting/finance programs are also curated to add courses in python and financial modelling instead of just the fin engineers and quants doing the modelling.

  11. Robert Klemme | February 10, 2013 at 7:42 am

    "The biggest problem is that anyone can create Excel spreadsheets-badly. Because it's so easy to use, the creation of even important spreadsheets is not restricted to people who understand programming and do it in a methodical, well-documented way."

    Unfortunately this is not only true for Excel and Powerpoint but for other parts of common office suites as well. One of my pet peeves is text documents where empty paragraphs are used to set paragraphs visually apart. Another one is the abundance of ad hoc formatting which increases size of documents and makes them look odd.

  12. Bench Warmer | February 10, 2013 at 7:46 am

    I just like the old copy and paste format.
    http://sports.yahoo.com/blogs/mlb-big-league-stew/photos-marlins-fans-turn-winter-warm-event-220646017–mlb.html

  13. Jim Dikelow | February 10, 2013 at 8:09 am

    There is a rich literature (accessible through Google Scholar) on the deficiencies of Excel for ststistical calculation and the unwiilingness or inability of Microsoft to fix them. Much better is the open source (and free) Excel clone, Gnumeric.

  14. Moses Herzog | February 10, 2013 at 8:23 am

    Well, I think it's safe to say I've been one of the more loyal and LONGTIME readers of this site. I joined in on the dialogue here after Simon's first appearance on Bill Moyers. I wanna say that was 2008, but if it wasn't 2008 it was 2009. I gladly purchased both books, and I enjoyed them immensely and learned a lot from them (I learned more from "13 Bankers" than from "Burning" but I walked away with better understand and knowledge base from both).

    But I think this will probably be my last post here. The "sweet spot" from Mike Konczal's old site (Rortybomb) really left after he went to the Roosevelt Institute. Konczal's writing drastically changed after that. And I'm afraid that time has come to this site as well. Too much time intervals between posts, too many tease links to other sites where after years of loyalty to the two hosts we get the crumbs from the paid sites tossed at us for increased link hits.

    And frankly, too many phrases like this one I don't remember seeing in this site's early days: "where I saw more BlackBerries than I have seen in years-can't lawyers and lobbyists afford decent phones?"
    I suppose people would say I am being "hypersensitive" to be upset at such a small passing remark. But class snobbery, and judging people on socio-economic issues was kind of what I had thought this site was supposed to be fighting against, not encouraging. I don't think James Kwak would have written that in the first 6 months-1yr+ of when this site was at its best. So to me, it's like picking up a glass of what you thought was your favorite brand of soda pop, but when you ran it on your tongue you found they had substituted the sugar with high fructose corn syrup, and it just wasn't the same.

    When I think of Konczal repeating liberal talking points VERBATIM (not because they are the best, but as if he phoned the ACLU and Berkley Liberal Arts Dept. before every post to see if it was kosher) and James Kwak saying things like "where I saw more BlackBerries than I have seen in years-can't lawyers and lobbyists afford decent phones?" I think I am tasting the Dr. Pepper with high fructose corn syrup and not the Dr Pepper with sugar ("Rortybomb" and James Kwak circa 2009-20??) that I liked so much.

    Warm Regards,
    Resentful and Bitter Zeitgeist crouching under the keyboard.

  15. Marc Heller | February 10, 2013 at 10:28 am

    Dear Moses -
    I've never commented on this blog before, but I always read what others have to say. I hope you reconsider you decision to not continue making your ideas known on this blog. I, for one, look forward to what you have to say.

  16. Dan Palanza | February 10, 2013 at 10:54 am

    Let me add to Moses Herzog's critique: Your topic today, whether you know it or not, is a bookkeeping problem/solution pattern. I discovered, 35 years ago, when I bought my first microprocessor-driven computer, that it is impossible to create a proper double-entry book-of-accounts using any spreadsheet, no matter how it is designed.

    There are four fundamenta variables in a proper book-of-accounts: debtor-value, creditor-rights, cash, and capital. Pacioli calls the bookkeepers attention to these four variables in 1494 A.D. The four are isomorphically related to one another within the contract that the bookkeeping sets out to resolve; read as "sets out to balance."

    Until these bookkeeping fundamentals are understood by the software community, and the accounting software designers know how to create the four-way isomorphic balance by using a serious programming language, the banks will continue to steal from the monetary system.

    Right now, James, your above essay has no right to take up the bandwidth that you are wasting by your not knowing where first base is located in the playing-field of software-driven accounting.

  17. Colin | February 10, 2013 at 11:30 am

    The FT Alphaville team have a very good overview of this - http://ftalphaville.ft.com/2013/01/17/1342082/a-tempest-in-a-spreadsheet/

  18. MacCruiskeen | February 10, 2013 at 12:17 pm

    "PowerPoint has an oft-noted, parallel problem"

    Actually, the real problem with PP is that it, at root, has no actual value at all. Anything that can be communicated with PP can be communicated better with some other media: verbally, paper, PDF, whatever. Heck, most presentations and meetings are really unnecessary.

  19. Cornbread | February 10, 2013 at 12:56 pm

    I don't use Excel except when I am required to. There's several reasons for this. First off, it's ability to do complex mathematics is incredibly poor. If you're not dealing with something that was linear in the first place Excel is not the best program for it. If you've got phenomena operating in opposing directions it's not terrible if you can linearize them and run in very small steps. But if you've got coupled equations that aren't necessarily positively or negatively reinforcing, Excel should be last on the list.

    Second problem is documentation. Excel spreadsheets are notorious for being poorly documented. Shoot, it's difficult to see who came up with it in the first place, then you've got to comb through the attached VBA and the cell formulas to find it.

    Third, mobility of data. I've seen many instances where Excel doesn't correctly or consistently pull from a database or update the database correctly. It was certainly a matter of poor coding, but Excel doesn't seem to consistently write to databases from machine to machine.

    What Excel is pretty good for is dumbed-down visualization, ie, when you're presenting technical stuff but need to make it ultra pedestrian for the business school guys, nothing's better than Excel's cartoonish graphics.

  20. Austin | February 10, 2013 at 1:06 pm

    any "quant" worth their weight should be intimately familiar with SAS and SQL and deliver their results via Excel so the "business" can lipstick that pig.

  21. Oregano | February 10, 2013 at 1:24 pm

    @Moses, I have to agree with you. Baselinescenario was created to discuss the reasons behind and way forward from the Crash of 2008-2009 and had its greatest relevancy then. Now, it's mostly a site waiting and hoping for something positive to happen in a political environment that has regrouped and moved on in the same general direction it was heading before the crash. Simon and James have spoken their peace, written their books, and are moving on as well. There needs to be a change in national and Beltway politics that can reestablish a nexus of pragmatic governance for this site to regain its vitality. I'm hoping that my child will see that day. I suspect I won't.

  22. aronjohnson | February 10, 2013 at 1:35 pm

    I totally disagree with this disgusting comment!!! Excel has several powerful tools although it's solver function could be improved for non linear responses and coupled models with over 4 degrees of freedom (a convergence issue). For after tax cash flow analysis (in Engineering Economics class), I once programmed in Excel the entire IRS Depreciation schedule with just two inputs and a lot of conditional logic. (I found an error in the published tables which they fixed.)

    It does not do PDE's which economists seem to love and practical engineers regard as oversimplifications of reality since most of them ignore friction (economic parallel??). It doesn't do computational fluid dynamics either but we have good programs like Fluent which do. In addition to the wonderful flow visualizations, they provide the data we need in spreadsheet format for further analysis.

    I like Excel because I can check the individual calculations which are hidden by most programs.

  23. Keith Stephenson | February 10, 2013 at 2:28 pm

    By the way from what I have read, A program that excel basically copied, Lotus 1-2-3, was basically the 'killer app' of personal computers which made forever changed accountants jobs for the better. The PC revolution in the early 80′s took off from there. It that way the spreadsheet is the most important application of all time.

  24. Anonyomouse | February 10, 2013 at 6:02 pm

    I sympathize with Moses's concerns. When statistics and numbers are manipulated to favor the ruling classes agenda, and since we it has been proved that politicians, (at this point in time), can no longer distinguish right from wrong, or right from the left, or anything other than the current status quo. We really do have a dysfunctional congress, that takes money candidates over quality ones. And we have a wall street that is in denial to it's dying breath.
    I myself don't know how it can sustain itself, and had I the power, I would expose them all.
    It's really only a matter of time as to when, and how, the reset equation is used, so the strong CAN carry on. Until then we must put up with disinformation politics, and economics, because it has become the norm.

  25. Jack | February 10, 2013 at 6:26 pm

    Blaming Excel is like blaming guns for gun crime. There are many technical ways to potentially overcome the 'copy and pasting' problem. For example using templates, locked cells/sheets, Excel Services, document-level customizations etc.

    The problem described is NOT a technical one. Its social or organizational. What is it about the banks and those they employ which allow casual and unprofessional 'hacks' to be built for mission critical activity? Why don't they know their tools well? Where is the training? Where is choosing the right tool for the job? Where is the technical oversight and leadership?

  26. Worthy LaFollette | February 10, 2013 at 7:32 pm

    Excel, Powerpoint and Access all suffer from similar issues – they are tools which are overused by amateurs. What is described above for Excel is exactly the same sort of critique Tufte has of Powerpoint – namely it creates a false sense of competence and is overused in places it should not be in order to satisfy the gods of "convenience".

  27. Derek Burden | Joe Smith | February 10, 2013 at 8:16 pm

    I think so se sight of the original facts. The bank contracted out the building of a model to an "expert" who made a programming error. I agree with the original post that it is too easy to make such errors in Excel. Excel should probably not be used for production calculations but when it is, the model should be properly tested and documented. I wonder what the state of the art is for standards and tools to document and test spreadsheets. (VisiCalc was the original killer app – Lotus 123 was a clone written for the IBM-PC)

  28. kylehreed | February 10, 2013 at 8:29 pm

    I just discounted everything you wrote because of your hysterical and gratuitous Microsoft bashing. By the way, Lotus had an innovative commanding system but still essentially ripped off VisiCalc and Multiplan. Excel was the first with a GUI.

  29. ShamelessBill | February 10, 2013 at 9:26 pm

    @David (professional software engineer): Your point concerning the importance of experience, training, and competence is well taken. Having said that,…
    With all due respect for your doubtless considerable expertise in building software that does the work it is designed to do, please bear in mind this well established and humbling fact: Every human being is subject to marvelously seductive illusions of attention, memory, confidence and knowledge.
    Awareness of their existence and power immunizes none of us against these illusions. Self-deception afflicts each of us, bar none.

  30. Celer | February 10, 2013 at 10:05 pm

    "After subtracting the old rate from the new rate, the spreadsheet divided by their sum instead of their average, as the modeler had intended. This error likely had the effect of muting volatility by a factor of two and of lowering the VaR . . ."

    "As the modeler had intended"….what?

    The spreadsheet didn't do anything the modeler screwed up. Excel is singularly unimaginative and uncreative and suffers from being able to do one linear thing at a time which undermines supposedly nice features like Real TIme Data links etc. Oy a+b vs (a+b)/2…..or such.

  31. Celer | February 10, 2013 at 10:18 pm

    Believing in Models in the first place is a far bigger error perhaps than misconstructing them…a model allows humans to withdraw from responsibility for their actions. Models are only models. Reality is far more complex than a model and the assumption a model captures reality in a suitably complete manner to be the sole basis of decision making is stupid in itself.

    In any case the dialogue set out above gets sucked into the bs of technical matters when in fact there were some very obvious 'non-technical' matters that should have constrained the Whale or led to his constraint by a supervisor (not a committee with no apparent singular person held accountable) like an overall position limit for example which requires no VAR to ascertain and could easily have been summed up on one or two sheets of an Excel workbook.

    And so through technical arguments, critiques of spreadsheets, committees
    we lose track of the fact that the Whale was out of control and no one is ultimately held responsible in a significant way. Controlling unbridled corporate Banking Greed would go much further than quiblling about models to prevent trading abuses – a class and moral approach to controlling criminal behavior.

[Oct 20, 2010] Recurring formula for new records

Excel Help Forum

...have an Excel table. Each row is a separate record. The user types in new records at the end of the table. One of the columns in the table contains a formula. Ideally the formula should apply for each cell in the entire column. Since I don't know how long the table will end up being, I simply ask that the user drag the formula down from the cell above. I do not want the user to accidentally change the formula though. At first I thought a Custom validation with a secret word that the user is unlikely to type in, would keep the user from changing the formula already in the cell. However, I can't stop the user from deleting the formula, which doesn't help. I then thought maybe locking (protecting with password) the column would do the trick. However, now the user can't drag down the formula from the cell above. I would therefor like help with one of 2 options:

1- have the formula apply to the cell automatically as a new record is created (ideal)
2- find a way to protect the cell so that the user can copy the formula down, but not change or delete it

Any brain waves?

I did find the following code online, which appears to go some ways to answer my first option, though I can't figure out why it only works if the formula is in column A (i.e. I have formulas in columns A, B, K, L, and AF to DL).

Code:

.alt2 font { font: 11px monospace !important; color: #333 !important; } body.dp {} .tiny { font-family: verdana,arial,helvetica,sans-serif; font-size: xx-small; } .h3color { font-family: verdana,arial,helvetica,sans-serif; color: #E47911; font-size: x-small; }
Option Explicit 
 
Private Sub Worksheet_Change(ByVal Target As Range) 
    Dim r As Range, c As Range, t As Range 
     
    On  Error Goto TheEnd 
     
    If Not  Intersect(Target, [E9:E65536]) Is Nothing Then 'Skips B1.  Titles usually on first 7 rows. Row 8 has first  formulas. Column E has first user typed entry
        MsgBox Intersect(Target, [E9:E65536]).Address & " was changed." 
        Application.EnableEvents = False 
        Application.Calculation = xlCalculationManual 
        Application. ScreenUpdating = False 
         
        For Each t In Target 
            Set r = Union(Range("A" & t.Row), Range("C" & t.Row, Cells(t.Row, 19))) 
            If VarType(t.Value) = vbEmpty Then 
                 'r.ClearContents  'Clear all contents in row where B is empty.
                For Each c In r 'Clear only formulas where B is set to empty.
                    If c.HasFormula Then c.Clear 
                Next c 
                Goto nextt 
            End If 
            For Each c In r 
                If c.Offset(-1, 0).HasFormula Then c.Offset(-1, 0).Copy c 
            Next c 
nextt: 
        Next t 
         
TheEnd: 
        Application.EnableEvents = True 
        Application.Calculation = xlCalculationAutomatic 
        Application.ScreenUpdating = True 
    End If 
     
End Sub

Last edited by m.cain; 02-21-2008 at 07:02 PM. Reason: found possible code, though not sure how to use

ATM • Spreadsheet Files

Here we present a huge collection of spreadsheet files which originally appeared on a CD which was sent, free, to members of ATM. We have now negotiated permission to make them available here.

But, if you're not a member, please consider joining ATM. Not only will you be supporting the further development of resources such as these but you will also receive the other benefits of belonging to one of the leading professional associations for mathematics educationalists.

ATM • Problem Solving with Interactive Spreadsheets

ATM's ICT publication is a CD of 16 programs especially designed to be used with any interactive whiteboard or projector. The programs provide a variety of teaching opportunities for KS2, KS3 and KS4 classrooms and because they are so flexible, almost all can be used at all these key stages.

<< Samples from Problem Solving with Interactive Spreadsheets

The CD associated with this book contains sixteen interactive spreadsheets which have been developed to pose interesting and challenging problems. The tasks, many of which are well known, have been transferred to this format to help teachers offer learning opportunities to pupils that help develop their skills of mathematical reasoning.

The files will work with Excel and most other spreadsheet applications.

The problems require little prior knowledge in terms of explicit mathematical content. The aim is to provide problems that are challenging but accessible to students in the upper primary and secondary age ranges. Most of the problems include a range of extension problems to challenge even the most able.

The files are interactive so that as learners respond to the tasks and questions posed the next tasks appear on the screen automatically. The teacher's guide offers suggestions for further extension and hints concerning solutions and proofs.

[Nov 5, 2006] http://www.exceltip.com/st/TipofTheDay/295.html

Today's Tip

*Applying Colors to Maximum/Minimum Values in a List*


*To apply colors to maximum and/or minimum values:*

Rounding all the numbers in column A to zero decimal places, except for those 
that have "5" in the first decimal place.

1. Select a cell in the region, and press Ctrl+Shift+* (in Excel 2003, press 
this or Ctrl+A) to select the Current Region.
2. From the Format menu, select Conditional Formatting.
3. In Condition 1, select Formula Is, and type =MAX($F:$F) =$F1.
4. Click Format, select the Font tab, select a color, and then click OK.
5. In Condition 2, select Formula Is, and type =MIN($F:$F) =$F1.
6. Repeat step 4, select a different color than you selected for Condition 1, 
and then click OK.

*Note:*
Be sure to distinguish between absolute reference and relative reference when 
entering the formulas.

See screenshot for a sample

<http://www.exceltip.com/images/screenshots/295.gif>  

[Oct 24, 2006] Microsoft Office Assistance You asked for it Crabby's top 5 Excel tips and tricks

Double-click your way to fast formatting

By default, when you copy text from a different program (such as Microsoft Word, Microsoft PowerPoint®, or even Microsoft Internet Explorer) and paste it into an Excel cell, the original formatting comes along for the ride. Sometimes this leaves you with a tiny little cell with GIGANTIC FORMATTING that you need to fix by hand.

The Paste Options button that looks like this:

pops up, and you can use it to keep the original formatting or to match the destination formatting (that is, the formatting you've chosen for your cell). However, if you're like me (or want to be like me - aren't you sweet), that is just too many steps if you know that you want to match the formatting of the cell. This is especially true if you have a another program and keep your cell's formatting

  1. Select the text you want to pop into your Excel worksheet.
  2. Press CTRL+C.
  3. Switch back to Excel.
    Tip Hey! Now's a good time to use that shortcut I told you about in the first tip!
  4. Double-click in the cell, and press CTRL+V.

It's like magic in a click (or two).

And so you have it. Five decent little tips to boost your skill level and keep your work humming along. So start the music!

"We are what we repeatedly do. Excellence, then, is not an act, but a habit." - Aristotle

[Oct 24, 2006] 4 quick tips for using Excel

Creating a drop-down menu

Lynda Morris, president of NicLyn Consulting Corp, an Internet-based computer and network service company, often finds herself entering the same information or formulas in different parts of a spreadsheet. Instead of typing data repeatedly, she creates drop-down menus that save typing time. To create a drop-down menu:

Open a worksheet and label your menu (Days for example) in the top cell of a column.
Enter data (days of the week, in this case) in the cells below the label (such as A2 to A8).
Highlight the list, including the label.
Click the Insert menu, then click Name, and then Define.
In the box under Names in workbook you will see the name of the range (Days). In the Refers to box, it should list "=Sheet1!" followed by the range (=Sheet1!$A$1:$A$8).
Click Add and then OK.
Select any cell or range of cells in which you want to enter data.
Click the Data menu and then Validation.
From the Allow drop-down menu, select List. In Source, type "=" and the name of your range (=Days). Make sure that the In-cell dropdown box is checked.
Click one of the cells you highlighted to see your drop-down menu. Choose the item you want to enter in the active cell. Enter data from your list, in any order, in each of the cells you selected.

[May 22, 2006] freshmeat.net Project details for SpreadsheetWriteExcel 2.17 by John McNamara

About: Spreadsheet::WriteExcel is a Perl module which can be used to create native Excel binary files. Formatted text and numbers can be written to multiple worksheets in a workbook. Formulas and functions are also supported. It is 100% Perl and doesn't require any Windows libraries or a copy of Excel. It will also work on the majority of Unix and Macintosh platforms. Generated files are compatible with Excel 97, 2000, 2002, and 2003, and with OpenOffice and Gnumeric. An older version also supports Excel 5/95.

Changes: This release includes minor bugfixes and enhancements.

[Feb 16, 2006] Write VBA with the Macro Recorder in Word and Excel Take advantage of autogenerated code to speed up your coding efforts.

[Jan 13, 2004] Excel 2003 Power Programming with VBA (Excel Power Programming With Vba)

by John Walkenbach (Paperback - Jan 13, 2004)

Buy a book to match your background and needs, February 11, 2004
Needing to finish my first Excel add-in, and frustrated by the incompleteness and obscurity of MS's help system, I picked up this book after reading warm recommendations from readers of earlier versions.

If you have never programmed Excel before, but have programmed a tiny bit in some other language, and do not have great ambitions for software development, this might be a fine text. It is quite readable and full of useful information. Walkenbach introduces VBA quickly, which is great, but so quickly he forgets to say what most of the language constructs do. His approach to teaching the Excel object model is to provide several fairly well written examples of little macros and utilities, each one with a clear English explanation. Unfortunately, if the technique you need does not appear in any of these examples, you are out of luck, because his explanations are neither extensive, detailed, nor thorough enough to impart a good understanding of what is going on. This, coupled with Excel's erratic behavior (mis-type a property name and watch your user form mysteriously disappear, for instance), makes it very difficult to become independently productive without spilling a lot of sweat and tears.

The book's strengths include the numerous and well-organized examples provided on the companion CD; the occasional sidebar that offers first-hand knowledge of bugs, inconsistencies, and strange design; fairly broad, if incomplete, coverage of the major aspects of Excel VBA programming; and very clear indications of differences among various Excel versions (97, 2000, 2003 mainly). Walkenbach is obviously an expert and has been so for a long time.

The weaknesses become apparent in contrasting this book with, say, Roman's text (O'Reilley). Where Walkenbach gives a macro to display all the icons associated with the several thousand Excel 'FaceId's, Roman publishes the complete table as an appendix. Where Walkenbach loosely skims over the properties of many key objects, such as ranges and charts, Roman takes the time to provide a terse but useful description of nearly every property, as well as a very illuminating diagram of the object hierarchy. Where Walkenbach completely omits to describe how VBA works, Roman actually offers a deeper explanation (showing how object references are arranged in memory, for instance, and describing exactly how a for..next loop is executed). Boring stuff for some, maybe, but a huge time saver for those who appreciate that the details matter. For someone who either has a lot of programming experience, or who plans to develop more than toy utilities or one-off apps in Excel VBA, Roman's approach is much more useful than Walkenbach's.

If Walkenbach is appropriate for your background and ambitions, then you will probably agree it is a four- or five-star effort. Otherwise, you will likely be somewhat disappointed and, like me, will quickly find yourself looking for another book.

Excel 2003 Formulas Books

by John Walkenbach

Walkenbach and Excel...a great combination!, November 24, 2003
Reviewer: William A. Huber (Rosemont, PA USA) - See all my reviews
Reviewer: Claire (Jasper, Indiana) - See all my reviews
Excel 2003 Formulas provides some amazing ways to really master Excel. Walkenbach's formulas are the best! He covers operators, naming techniques, debugging, auditing, developing custom VBA functions, array formulas, imported 1-2-3 files, etc. Oh, and the CD that comes with this one is load too! Keep it within arms reach, August 19, 2005
Reviewer: Mark Jeffrey Lipin (Arizona) - See all my reviews
If you only get 1 book on Excel, buy this book! Its simplicity of use and easy reading style make it a perfect companion for a basic to an experienced Excel user (or abuser).

Excel Scientific and Engineering Cookbook (Cookbooks (O'Reilly)) Books David Bourg

Great showcase of Excel's mathematical capabilities, February 6, 2006
Reviewer: calvinnme "Texan refugee" (Fredericksburg, Va) - See all my reviews
It is often either inconvenient or financially unfeasible to solve every mathematical problem with a standalone procedural language or an expensive application such as Matlab or Mathematica. Since Excel has such advanced computational capabilities built into it and is ubiquitous in offices and homes, it would make sense to mine this application for numerical problem solving techniques. Bourg does a masterful job of presenting the tools available in Excel and showing the reader how to put them to work to solve real world engineering and mathematical problems. First, he spends a couple of chapters briefly going over Excel and its language VBA (Visual Basic for Applications). Next he goes over some intermediate level tasks such as collecting and cleaning up data, charting including 3D charts, and statistical analysis tasks such as correlating data and generating random numbers. Next Bourg moves into purely mathematical tasks such as working with discrete Fourier transforms, manipulating matrices and vectors, as well as working with basic mathematical functions. Once you have learned to use these mathematical tools in Excel, Bourg uses them to show you how to perform curve fitting and regression, solve equations, perform integration and differentiation, and solve both ordinary and partial differential equations. He even spends time on performing multivariable calculus and the finite element method in Excel. Chapter 13, on optimization, was my favorite chapter. Here, Bourg is actually getting into a little algorithm analysis and design via linear programming and genetic algorithms. This chapter showed me some truly innovative uses of Excel. The final chapter deviates from the scientific flavor of the book and concentrates on more traditional financial applications.
The writing is very accessible, the examples are clear and very creative, and the author does a tremendous job of capturing the range of Excel's mathematical capabilities. Do not expect mathematical theory in this book. Bourg assumes you already know how to set up a mathematical problem and that you just need a computational platform and tools with which to perform your work. It is nice to know I am not necessarily held hostage by Mathworks(maker of Matlab) every time I need to solve a mathematical problem of any level of sophistication. Highly recommended. The table of contents is as follows:
1. Using Excel
2. Getting Acquainted with Visual Basic for Applications
3. Collecting and Cleaning Up Data
4. Charting
5. Statistical Analysis
6. Time Series Analysis
7. Mathematical Functions
8. Curve Fitting and Regression
9. Solving Equations
10. Numerical Integration and Differentiation
11. Solving Ordinary Differential Equations
12. Solving Partial Differenby Gerald Knight tial Equations
13. Performing Optimization Analyses in Excel
14. Introduction to Financial Calculations

Analyzing Business Data with Excel

by Gerald Knight

Recommended Links

oreilly/Excel Annoyances

Free Excel Training. Basic-Level 1 Free Excel Training

  1. Excel Lesson 1: The Fundamentals of Excel

  2. Excel Lesson 2: How to Start Excel, Creating and Saving Workbooks

  3. Excel Lesson 3: Toolbars and Task Panes in Excel

  4. Excel Lesson 4: Insert/Delete/Name/Move Worksheets

  5. Excel Lesson 5: Cells, Moving Around Workbooks, Exiting

  6. Excel Lesson 6: Basic Copy/Cut & Paste

  7. Excel Lesson 7: Quick Copying Using the Fill Handle

  8. Excel Lesson 8: Pasting Using Paste Special

  9. Excel Lesson 9: Inserting Rows, Columns and Cells

  10. Excel Lesson 10: Changing the Standard Default Options in Excel

  11. Excel Lesson 11: Using the Undo and Redo Feature

  12. Excel Lesson 12: Using the Format Painter to Copy a Cells Format

  13. Excel Lesson 13: Using Dates and Times in Excel

  14. Excel Lesson 14: Working with Custom Formats in Excel

  15. Excel Lesson 15: The Basics of Excel Formulas

  16. Excel Lesson 16: Excel Cell References. Relative and Absolute Cell References

  17. Excel Lesson 17: Avoid Typing Whenever Possible. Use Your Mouse Pointer

  18. Excel Lesson 18: Excel Arguments and Syntax in Formulas

  19. Excel Lesson 19: Excel AutoSum Function/Formula

  20. Excel Lesson 20: Excel Auto Calculate. Calculate Without Formulas

  21. Excel Lesson 21: How to Use the Insert Function Feature to Create Formulas

  22. Excel Lesson 22: Using Some of Excel's most Common Functions

  23. Excel Lesson 23: Using Named Ranges in Excel as an Alternative to Cell References

  24. Excel Lesson 24: Using Constants to Name Values and the Paste Name Dialog

  25. Excel Lesson 25: Excel Calculations/How Excel Calculates/The Order of Calculations

  26. Excel Lesson 26: Excel Cell Comments

  27. Excel Lesson 27: Excel Find & Replace

  28. Excel Lesson 28: Different Methods of Clearing Cell Contents

  29. Excel Lesson 29: Effective Printing in Excel 1

  30. Excel Lesson 30: Effective Printing in Excel 2

  31. Excel Lesson 31: Sorting Data in Excel

  32. Excel Lesson 32: Hide/Show Rows/Columns in Excel

  33. Excel Lesson 33: Excel AutoFormats

  34. Excel Lesson 34: Creating a Basic Spreadsheet

  35. Excel Lesson 35: Charting the Basic Spreadsheet

  36. Excel Lesson 36: Excel Worksheet Protection

  37. Excel Lesson 37: Excel IF Formula/Function

  38. Excel Lesson 38: Volatile Functions-Now & Today

Microsoft Excel Tips from Excel Tip .com - Excel Tutorial - Free Excel Help

Excel VBA Examples

EXCEL and EXCEL/VBA examples

VBA Programming in Excel

Excel Viewer 2003

Recommended Articles

oreilly.com -- Online Catalog Integrating Excel and Access

oreilly.com -- Online Catalog Excel The Missing Manual

oreilly.com -- Online Catalog Excel 2003 Personal Trainer

oreilly.com -- Online Catalog Excel Scientific and Engineering Cookbook

oreilly.com -- Online Catalog Excel 2000 in a Nutshell, First Edition

oreilly.com -- Online Catalog Excel 2003 Programming A Developer's Notebook

Excel Object Model

oreilly.com -- Online Catalog Writing Excel Macros with VBA, Second Edition



Etc

Society

Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers :   Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism  : The Iron Law of Oligarchy : Libertarian Philosophy

Quotes

War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda  : SE quotes : Language Design and Programming Quotes : Random IT-related quotesSomerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose BierceBernard Shaw : Mark Twain Quotes

Bulletin:

Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 :  Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method  : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law

History:

Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds  : Larry Wall  : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOSProgramming Languages History : PL/1 : Simula 67 : C : History of GCC developmentScripting Languages : Perl history   : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history

Classic books:

The Peter Principle : Parkinson Law : 1984 : The Mythical Man-MonthHow to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Hater’s Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite

Most popular humor pages:

Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor

The Last but not Least Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand ~Archibald Putt. Ph.D


Copyright © 1996-2020 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...

You can use PayPal to to buy a cup of coffee for authors of this site

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. You you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: July 30, 2020