Copy entire Word content in Excel, one paragraph by one

Ok, I had to deal with a Word document consisting in normal paragraphs and tables, and I wanted it on Excel.

But tables have several columns while paragraph is just monocolumn, so I needed to get them separated but with some order.

Here is the initial macro to separate contents:

Option Explicit

Sub WordImport()
    Dim wdApp As Word.Application
    Dim wdDoc As Word.Document
    Dim wdParag As Word.Paragraph
    Dim wdTable As Word.Table
    Dim wdRow As Word.Row
    Dim wdColumn As Word.Column
    Dim aWidth() As Single
    Dim oWsh As Excel.Worksheet
    Dim wdRng As Word.Range
    Dim wdRngTable As Word.Range
    Set wdApp = GetObject(, "Word.Application")
    'Set wApp = CreateObject("Word.Application")
    Set wdDoc = wdApp.ActiveDocument
    'Set wDoc = wApp.Documents.Open("C:\File.docx", ReadOnly:=True)

    Dim lgR As Long: lgR = 0
    Set oWsh = ActiveSheet
    With wdDoc
        Set wdRng = .Paragraphs(1).Range
        Do While wdRng.Paragraphs(1).Range.End <> .Range.End
            If Not wdRng.Information(wdWithInTable) Then
                'MsgBox wdRng.Text
                wdRng.Collapse wdCollapseEnd
                wdRng.MoveEnd wdParagraph, 1 'move one paragraph in text
                'Call sParagraphPaste(wdParag, oWsh, lgR)
            ' We have reached a table...
                Set wdRngTable = wdRng.Tables(1).Range
                'For Each wdColumn In wdRng.Tables(1).Columns
                '    Merge columns until oCell.Width >= wdColumn.Width
                'Next wdColumn
                ' Move to external procedure...
                Set oWsh = ThisWorkbook.Worksheets.Add
                Call sTablePaste(wdRngTable, oWsh, lgR)
                wdRngTable.Collapse wdCollapseEnd ' so initial and end points are equal
                wdRngTable.MoveEnd wdParagraph, 1
                Set wdRng = wdRngTable.Paragraphs(1).Range
            End If
        Loop 'Until wdRng.End = .Range.End
        'Check the last paragraph
        If Not wdRng.Information(wdWithInTable) Then
            'MsgBox wdRng.Text
        End If
    End With

End Sub

Private Sub sTablePaste(ByVal wdRngTable As Word.Range, _
                        ByVal oWsh As Excel.Worksheet, _
                        ByRef lgR As Long)
    'For Each wdTable In .Tables
    'Next wdTable
    'For Each wdParag In .Paragraphs
        'If wdParag.Range.Words.Count > 1 Then
            With oWsh
                .Range("A1").Offset(lgR, 0).Activate
                lgR = lgR + lgR
            End With
        'End If
    'Next wdParag
End Sub

Private Sub sParagraphPaste(ByVal wdParag As Word.Paragraph, _
                            ByVal oWsh As Excel.Worksheet, _
                            ByRef lgR As Long)
    'For Each wdParag In .Paragraphs
        'If wdParag.Range.Words.Count > 1 Then
            With oWsh
                .Range("A1").Offset(lgR, 0).Activate
                lgR = lgR + lgR
            End With
        'End If
    'Next wdParag
End Sub

And there is something more coming here, as we need to autofit rows to content, and they may be merged, so new problem.

A little piece of code to solve it (from, and contextures blog). The Small Man explores some handy alternatives to this topic in the workbook on that post.

Sub MergedAreaRowAutofit()
    Dim j As Long
    Dim n As Long
    Dim i As Long
    Dim MW As Double 'merge width
    Dim RH As Double 'row height
    Dim MaxRH As Double
    Dim rngMArea As Range
    Dim rng As Range
    Const SpareCol As Long = 26
    Set rng = Range("C10:O" & Range("C" & Rows.Count).End(xlUp).Row)
    With rng
        For j = 1 To .Rows.Count
            'if the row is not hidden
            If Not .Parent.Rows(.Cells(j, 1).Row).Hidden Then
                'if the cells have data
                If Application.WorksheetFunction.CountA(.Rows(j)) Then
                    MaxRH = 0
                    For n = .Columns.Count To 1 Step -1
                        If Len(.Cells(j, n).Value) Then
                            If .Cells(j, n).MergeCells Then
                                Set rngMArea = .Cells(j, n).MergeArea
                                With rngMArea
                                    MW = 0
                                    If .WrapText Then
                                        'get the total width
                                        For i = 1 To .Cells.Count
                                            MW = MW + .Columns(i).ColumnWidth
                                        MW = MW + .Cells.Count * 0.66
                                        'use the spare column
                                        'and put the value,
                                        'make autofit,
                                        'get the row height
                                        With .Parent.Cells(.Row, SpareCol)
                                            .Value = rngMArea.Value
                                            .ColumnWidth = MW
                                            .WrapText = True
                                            RH = .RowHeight
                                            MaxRH = Application.Max(RH, MaxRH)
                                            .Value = vbNullString
                                            .WrapText = False
                                            .ColumnWidth = 8.43
                                        End With
                                        .RowHeight = MaxRH
                                    End If
                                End With
                            ElseIf .Cells(j, n).WrapText Then
                                RH = .Cells(j, n).RowHeight
                                .Cells(j, n).EntireRow.AutoFit
                                If .Cells(j, n).RowHeight < RH Then .Cells(j, n).RowHeight = RH
                            End If
                        End If
                End If
            End If
    End With
End Sub

Hope you enjoy it

Subscripts: ₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎
Superscripts: ⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾

Sub TestScriptness()
    Dim wdDoc As Word.Document
    Dim aWords As Word.Words
    Dim myRange As Word.Range
    Dim myNext As Word.Range

    Set wdDoc = ThisDocument 'ActiveDocument
    For Each myRange In ThisDocument.Words
        If myRange.Text Like "*[a-z,A-Z]#[0-9,a-z,A-Z]*" Then
            Set myNext = myRange.Next(Unit:=wdWord, Count:=1)
            Call FormatScript(myRange, myNext)
        End If
Exit For
        myNext = myRange
    Next myRange
End Sub

Public Sub FormatScript(ByVal mySource As Word.Range, _
                        ByRef myDestination As Word.Range)
    Dim myChr As Word.Range
    Dim myChrDest As Word.Range
    Dim lgChr As Long
        lgChr = 0
        For Each myChr In mySource.Characters
            lgChr = lgChr + 1
            With myChr
                If lgChr <= myDestination.Characters.Count Then
                    Set myChrDest = myDestination.Characters(lgChr)
                End If
                If .Font.Superscript = True Then
                    ' for LaTeX:
                    '.Font.Superscript = False: .InsertBefore "^"
                    myChrDest.Font.Superscript = True
                End If
                If .Font.Subscript = True Then
                    ' for LaTeX:
                    '.Font.Subscript = False: .InsertBefore "_"
                    myChrDest.Font.Subscript = True
                End If
            End With
        'Set myRange = myRange.NextStoryRange
    'Loop Until myRange Is Nothing
End Sub

2 thoughts on “Copy entire Word content in Excel, one paragraph by one”

  1. I don’t know whether it’s just me or if everybody else experiencing issues with your blog.It appears as if some of the text on your posts arerunning off the screen. Can somebody else please provide feedback andlet me know if this is happening to them too?This may be a problem with my browser because I’ve hadthis happen before. Cheers

    1. In posts with RegEx expressions it could be happening (problem with the plugin I’m using to show code beautified). I’m a newbe on WordPress and don’t know too much how to tune the plugins, but with time will solve and move to the appropiate ones.
      What I can say, at least for this post, I can’t see that artifacts (at least with my config, Win10+Chrome).
      Anyway, downloading the page as Html and opening on a Text Editor will have the code on place.
      Kind regards.

Leave a Reply

Your email address will not be published. Required fields are marked *