Book contents
- Frontmatter
- Dedication
- ANNELI LAX NEW MATHEMATICAL LIBRARY
- Contents
- Preface
- Acknowledgments
- 1 X Marks the Spot
- 2 Entering the Matrix
- 3 Sum Matrices
- 4 Fitting the Norm
- 5 Go Forth and Multiply
- 6 It's Elementary, My Dear Watson
- 7 Math to the Max
- 8 Stretch and Shrink
- 9 Zombie Math—Decomposing
- 10 What Are the Chances?
- 11 Mining for Meaning
- 12 Who's Number 1?
- 13 End of the Line
- Bibliography
- Index
11 - Mining for Meaning
Summary
From smartphones to tablets to laptops and even to supercomputers, data is being collected and produced. With so many bits and bytes, data analytics and data mining play unprecedented roles in computing. Linear algebra is an important tool in this field. In this chapter, we touch on some tools in data mining that use linear algebra, many built on ideas presented earlier in the book.
Before we start, how much data is a lot of data? Let's look to Facebook. What were you doing 15 minutes ago? In that time, the number of photos uploaded to Facebook is greater than the number of photographs stored in the New York Public Library photo archives. Think about the amount of data produced in the past two hours or since yesterday or last week. Even more impressive is how Facebook can organize the data so it can appear quickly in your news feed.
Slice and Dice
In Section 8.3, we looked at clustering and saw how to break data into two groups using an eigenvector. As we saw in that section, it can be helpful, and sometimes necessary for larger networks, to plot the adjacency matrix of a graph. In Figure 11.1, we see an example where a black square is placed where there is a nonzero entry in the matrix and a white square is placed otherwise.
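As a rough illustration of this kind of plot, the sketch below draws a matrix as text, using `#` in place of a black square and `.` in place of a white one. The function name `render_adjacency` and the four-node example graph are assumptions made for illustration; they do not come from the book.

```python
def render_adjacency(matrix, filled="#", empty="."):
    """Return a text picture: `filled` marks nonzero entries, `empty` marks zeros."""
    return "\n".join(
        "".join(filled if entry != 0 else empty for entry in row)
        for row in matrix
    )

# Hypothetical 4-node path graph with edges 0-1, 1-2, 2-3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

picture = render_adjacency(A)
print(picture)
```

For a network with hundreds of nodes, an image plot would be more practical than text, but the rule is the same: mark an entry if and only if it is nonzero.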
The goal of clustering is to find maximally intraconnected components and minimally interconnected components. In a plot of a matrix, this results in darker square regions. We saw this for a network of about fifty Facebook friends in Figure 8.5 (b). Now, let's turn to an even larger network. We'll analyze the graph of approximately 500 of my friends on Facebook. We see the adjacency matrix visualized in Figure 11.2 (a). Here we see little organization or pattern of connectivity among my friends. If we partition the group into two clusters using the Fiedler method outlined in Section 8.3, and then reorder the rows and columns so that the members of each cluster appear together, we see the matrix in Figure 11.2 (b).
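The partition-and-reorder step can be sketched in a few lines of NumPy. This is a minimal sketch, assuming the Fiedler method means splitting nodes by the sign of the eigenvector for the second-smallest eigenvalue of the graph Laplacian, and assuming a connected, undirected graph. The function names and the six-node example (two triangles joined by one edge, with interleaved node labels) are hypothetical and are not the Facebook data from the text.

```python
import numpy as np

def fiedler_partition(A):
    """Split nodes into two clusters by the sign of the Fiedler vector."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian: degrees minus adjacency
    _, eigenvectors = np.linalg.eigh(L)     # eigh returns eigenvalues in ascending order
    fiedler = eigenvectors[:, 1]            # eigenvector of the second-smallest eigenvalue
    return fiedler >= 0                     # boolean cluster label for each node

def reorder_by_cluster(A, labels):
    """Permute rows and columns so that each cluster's nodes are adjacent."""
    A = np.asarray(A)
    order = np.argsort(labels, kind="stable")
    return A[np.ix_(order, order)], order

# Hypothetical graph: triangles {0, 2, 4} and {1, 3, 5}, joined by the edge 4-1.
edges = [(0, 2), (0, 4), (2, 4), (1, 3), (1, 5), (3, 5), (4, 1)]
A = np.zeros((6, 6), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

labels = fiedler_partition(A)
B, order = reorder_by_cluster(A, labels)
print(B)
```

In the reordered matrix `B`, each triangle fills a dense 3-by-3 diagonal block and the single connecting edge falls in the off-diagonal blocks. The sign of an eigenvector is arbitrary, so which cluster comes first may vary.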
- Type: Chapter
- Information: When Life is Linear: From Computer Graphics to Bracketology, pp. 106-117
- Publisher: Mathematical Association of America
- Print publication year: 2015